CosmonautCode/Tiny-LLM-Ecosystem

The Tiny Local LLM Ecosystem

A Series of Self-Contained Systems for Budget-Conscious Developers



What is the Tiny Local LLM Ecosystem?

The Tiny Local LLM Ecosystem is an open-source initiative to democratize access to large language models for developers with limited hardware budgets. Rather than requiring expensive GPUs or cloud subscriptions, it provides a progression of self-contained, lightweight Python applications that run entirely locally on consumer-grade hardware.

Each project in this series is designed to be:

  • Minimal: Only essential dependencies, optimized for small footprints
  • Accessible: Works on modest CPUs; no GPU required
  • Private: All processing happens locally; no data leaves your machine
  • Self-Contained: Includes pre-downloaded models; works offline
  • Progressive: Start simple and graduate to advanced multi-agent systems

The Vision

I believe that AI capabilities shouldn't be gatekept behind expensive infrastructure or cloud subscriptions. Whether you're a hobbyist, student, small business owner, or developer in resource-constrained regions, you should have access to functional LLM systems.

This ecosystem creates a learning pathway where:

  1. Beginners can start with basic chat interfaces
  2. Intermediate developers can explore multi-personality agents and reasoning systems
  3. Advanced users can build sophisticated agent systems with knowledge graphs and tools

All without leaving their local machine or spending on API credits.

What's Been Built

1. Tiny Local LLM System

Repository: Tiny-Local-LLM-System/

  • Basic local LLM chat interface with TinyLlama-1.1B model
  • Rich console UI with syntax highlighting and formatted responses
  • Configurable settings: temperature, top-p, max tokens, system prompt
  • One-click launch: Double-click run_app.bat to start chatting
  • Model included: Pre-downloaded GGUF model for immediate use
  • Stack: Python + llama-cpp-python + Rich + UV
  • Status: Complete and fully functional
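To make the configurable settings above concrete, here is a minimal sketch of how a chat turn could be assembled for llama-cpp-python. The model path and prompts are illustrative placeholders, not taken from the repo:

```python
def build_request(system_prompt: str, user_msg: str,
                  temperature: float = 0.7, top_p: float = 0.9,
                  max_tokens: int = 256) -> dict:
    """Keyword arguments for Llama.create_chat_completion, covering the
    configurable settings listed above (temperature, top-p, max tokens)."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_msg},
        ],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

# To run a real turn (requires llama-cpp-python and a local GGUF file;
# the path below is hypothetical):
#   from llama_cpp import Llama
#   llm = Llama(model_path="models/tinyllama-1.1b-chat.gguf", n_ctx=2048, verbose=False)
#   reply = llm.create_chat_completion(**build_request("You are helpful.", "Hello!"))
#   print(reply["choices"][0]["message"]["content"])
```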

2. Tiny Local LLM System Expanded

Repository: Tiny-Local-LLM-System-Expanded/

  • Multi-personality agent selection with 3 pre-defined personas
  • Agent configuration via JSON: agents.json defines personalities and system prompts
  • Enhanced user experience: Choose your LLM personality on startup
  • Builds on foundation: Extends the basic system with customization layer
  • Status: Complete and fully functional
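The JSON-driven persona layer could look like the sketch below. The agents.json shape and persona names here are assumptions for illustration, not the repo's actual file:

```python
import json

# Hypothetical agents.json content with three personas.
AGENTS_JSON = """
{
  "agents": [
    {"name": "Tutor",  "system_prompt": "You explain concepts step by step."},
    {"name": "Pirate", "system_prompt": "You answer in pirate speak."},
    {"name": "Critic", "system_prompt": "You review answers skeptically."}
  ]
}
"""

def load_agents(raw: str) -> dict:
    """Map agent name -> system prompt for the startup selection menu."""
    data = json.loads(raw)
    return {a["name"]: a["system_prompt"] for a in data["agents"]}

agents = load_agents(AGENTS_JSON)
print(sorted(agents))  # → ['Critic', 'Pirate', 'Tutor']
```

Because personalities live in JSON rather than code, adding a fourth persona is a config edit, not a code change.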

3. Tiny Local Multi-Agent System

Repository: Tiny-Local-Multi-Agent-System/

  • Advanced multi-expert orchestration using Qwen2.5-1.5B model
  • 3-Phase pipeline: Technical spec extraction → specialist opinions → synthesis
  • Multiple specialized agents: Each with custom roles and expertise
  • Comprehensive reporting: Synthesizes diverse perspectives into unified specs
  • Modular architecture: Agent manager, query processor, LLM engine
  • Token usage tracking: Monitor model efficiency and costs
  • Status: Complete and fully functional
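The 3-phase pipeline above can be sketched as plain control flow, with the LLM call abstracted behind a `generate` callable. The prompts and role names are illustrative, not the repo's actual strings:

```python
from typing import Callable

def run_pipeline(query: str, specialists: dict[str, str],
                 generate: Callable[[str, str], str]) -> str:
    """generate(system_prompt, user_text) -> model reply."""
    # Phase 1: extract a technical spec from the raw query
    spec = generate("Extract the technical requirements.", query)
    # Phase 2: gather one opinion per specialist role
    opinions = {role: generate(prompt, spec) for role, prompt in specialists.items()}
    # Phase 3: synthesize all opinions into one unified spec
    combined = "\n".join(f"{role}: {text}" for role, text in opinions.items())
    return generate("Synthesize these opinions into one spec.", combined)
```

Injecting `generate` keeps the orchestration testable without loading a model, and mirrors the modular split between agent manager, query processor, and LLM engine.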

4. Tiny Hyena with Tools

Repository: Hyena/ (Under Development)

  • AI Agent System with multiple specialized personalities
  • File operations with secure workspace management
  • Dynamic tool calling with @read_file(), @write_file(), @list_files()
  • Rich console interface with responsive terminal sizing
  • Conversation management with save/load functionality
  • Workspace security with path restrictions and file size limits
  • Modular architecture for easy extension
  • Git LFS Integration: Large model files tracked efficiently
  • Use case: AI chat assistants, file management, coding help, documentation
  • Current features:
    • Multiple AI agents (Hyena, General Helper, Analyzer, Reviewer, Researcher, Code Expert)
    • Real-time file operations with tool integration
    • Rich terminal UI with dynamic sizing
    • Secure workspace management
    • Git LFS integration for large model files (*.gguf, *.bin, *.safetensors)
  • Model: Hyena3-4B-Instruct with llama-cpp-python
  • Status: Available but under development
  • Note: The core system is functional today, but the feature set is still growing
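Two of the features above, dynamic tool calling and workspace security, can be sketched in a few lines. The tool-call syntax matches the @read_file() style shown above; the function names and regex are illustrative, not the repo's implementation:

```python
import re
from pathlib import Path

TOOL_CALL = re.compile(r"@(read_file|write_file|list_files)\(([^)]*)\)")

def parse_tool_calls(text: str) -> list[tuple[str, str]]:
    """Return (tool_name, raw_args) pairs found in model output."""
    return TOOL_CALL.findall(text)

def resolve_in_workspace(workspace: Path, user_path: str) -> Path:
    """Reject any path that escapes the workspace (e.g. via '..')."""
    target = (workspace / user_path).resolve()
    if not target.is_relative_to(workspace.resolve()):
        raise PermissionError(f"{user_path!r} escapes the workspace")
    return target
```

Resolving the path first and then checking containment is what defeats `../` tricks; comparing raw strings would not.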

5. Hyena-AI CLI System

Repository: Hyena-AI/ (Complete and Production Ready)

  • Advanced AI CLI system with agentic tool loop and auto-memory
  • Claude Code CLI clone with Hyena branding and local LLM
  • Auto-Memory System: Conversations auto-save, AI extracts insights, context injection
  • Agentic Tool Loop: AI plans and executes multi-step operations with tool calls
  • Permission System: Y/N/Always/Never approval for dangerous operations
  • Rich Terminal UI: Modern interface with streaming responses and tool panels
  • No Manual Saves: Everything auto-persists to .hyena/ directory
  • Modular Architecture: 22 focused modules under 200 lines each
  • Google Standards: PEP 8 compliance following Google style guidelines
  • Git LFS Integration: Large model files tracked efficiently (*.gguf, *.bin, *.safetensors)
  • Use case: Professional AI development, code assistance, documentation, research
  • Current features:
    • Complete agentic loop with tool planning and execution
    • Auto-memory system with conversation persistence and insight extraction
    • Permission-based tool execution with user approval workflow
    • Rich streaming interface with live markdown rendering
    • Comprehensive command system (/help, /memory, /tools, /status, etc.)
    • Project-based memory with context injection
    • Session management and conversation compaction
    • Full workspace integration with secure file operations
  • Model: Hyena3-4B-Instruct with llama-cpp-python
  • Status: Complete and production ready
  • Note: This is the most advanced system in the ecosystem, ready for professional use
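The Y/N/Always/Never permission system described above boils down to a small piece of state. This is a sketch under the assumption that the prompt is injected as a callable (so the policy is testable without a terminal); the class name is illustrative:

```python
from typing import Callable

class PermissionGate:
    """Remembers Always/Never answers so repeated tool calls don't re-prompt."""

    def __init__(self, ask: Callable[[str], str]):
        self.ask = ask                # e.g. input() in the real CLI
        self.always: set[str] = set()
        self.never: set[str] = set()

    def allowed(self, tool: str) -> bool:
        if tool in self.always:
            return True
        if tool in self.never:
            return False
        answer = self.ask(f"Allow {tool}? [y/n/always/never] ").strip().lower()
        if answer == "always":
            self.always.add(tool)
        elif answer == "never":
            self.never.add(tool)
        return answer in ("y", "always")
```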

What's Coming Next

Phase 2: Knowledge & Retrieval Systems

6. Tiny Local LLM with Knowledge Graph

Repository: Tiny-Local-LLM-with-Knowledge-Graph/ (Coming Soon)

  • Knowledge graph integration for semantic relationships
  • Entity extraction and linking from user queries and documents
  • Graph visualization of interconnected knowledge
  • Contextual reasoning informed by knowledge structure
  • Semantic queries against graph relationships
  • Use case: Document analysis, research synthesis, domain-specific reasoning
  • Expected features:
    • Graph database integration (lightweight option like NetworkX or Neo4j)
    • Entity recognition and linking
    • Relationship inference
    • Path-based reasoning for complex queries

7. Tiny Local LLM with Vector Store

Repository: Tiny-Local-LLM-with-Vector-Store/ (Coming Soon)

  • RAG (Retrieval-Augmented Generation) pipeline
  • Vector embeddings for semantic search
  • Document ingestion with recursive chunking
  • Similarity-based retrieval for context augmentation
  • Persistent vector store for document collections
  • Use case: Document Q&A, research databases, knowledge base systems
  • Expected features:
    • Lightweight vector database (Faiss, ONNX embeddings, or similar)
    • Document ingestion pipeline
    • Chunking strategies for optimal context windows
    • Similarity ranking and relevance scoring
    • Conversation memory with context prefixing
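The retrieval core of a RAG pipeline, chunking with overlap plus similarity ranking, fits in a few functions. This is a sketch with toy vectors standing in for real embeddings; sizes and names are illustrative:

```python
import math

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size chunks; the overlap keeps context across chunk boundaries."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the k chunk ids most similar to the query embedding."""
    ranked = sorted(chunk_vecs, key=lambda c: cosine(query_vec, chunk_vecs[c]), reverse=True)
    return ranked[:k]
```

In the real pipeline, a CPU-friendly embedding model (e.g. in ONNX format, as noted above) would produce the vectors, and a vector database would replace the dict.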

Getting Started

Quick Start: Choose Your Level

Beginner

Start with the basic system:

cd Tiny-Local-LLM-System
./run_app.bat

Then explore the expanded version with multiple personalities.

Intermediate

Jump into multi-agent reasoning:

cd Tiny-Local-Multi-Agent-System
./run_app.bat

Advanced / Professional

Experience the full power of agentic AI with auto-memory:

cd Hyena-AI
uv run python -m app.app
# or use run_app.bat

This is the most advanced system with Claude Code CLI capabilities.

Expert

Wait for Phase 2 releases and build custom systems with knowledge graphs, vector stores, and advanced tool integration.

System Requirements

Minimum (Budget Hardware)

  • CPU: Dual-core 2GHz or better
  • RAM: 4GB (8GB recommended for multi-agent systems)
  • Storage: 2-3GB free space per model
  • OS: Windows (with WSL2), macOS, or Linux

Recommended (Better Performance)

  • CPU: Quad-core 2.5GHz or better
  • RAM: 16GB+ (8GB minimum for Hyena-AI)
  • Storage: 10GB+ for multiple models and memory storage
  • Note: GPU support can be added via llama-cpp-python compilation

Professional Grade (For Hyena-AI)

  • CPU: 6+ cores 3GHz+ for optimal agentic performance
  • RAM: 16GB+ recommended for memory system and tool execution
  • Storage: 20GB+ for models, conversations, and extracted memories
  • Note: Hyena-AI includes sophisticated memory and tool systems

No GPU Required: all projects work on CPU-only systems. With a good CPU, inference is surprisingly fast!


Project Progression Map

Phase 1: Foundation (Complete)
│
├─ CodeFlow (Analysis Tool)
│
├─ Tiny Local LLM (Basic Chat)
│  └─ Tiny Local LLM Expanded (Multi-Personality)
│     └─ Tiny Local Multi-Agent System (Expert Orchestration)
│
├─ Hyena (Tool-Based AI Agent System)
│
└─ Hyena-AI (Advanced CLI with Auto-Memory & Agentic Loop)

Phase 2: Knowledge & Retrieval (In Progress)
│
├─ Tiny Local LLM + Knowledge Graph (Semantic Reasoning)
│
├─ Tiny Local LLM + Vector Store (RAG & Document Q&A)
│
└─ Tiny Local Agent + Tools (Autonomous Execution & Workflows)

Architecture Philosophy

All projects in this ecosystem share common principles:

Dependency Minimalism

  • Only essential packages
  • UV for deterministic dependency resolution
  • Lock files for reproducibility
  • No heavy frameworks (no web server overhead for CLI apps)

Model Inclusivity

  • Uses small, efficient quantized models (1-3B parameters)
  • GGUF format for CPU-friendly inference
  • Models included in repo for offline-first experience
  • Easy model swapping for experimentation

Privacy & Security First

  • Zero cloud dependencies
  • No telemetry or tracking
  • Transparent code you can audit
  • Works completely offline after initial setup

Console-First UI

  • Rich library for beautiful terminal interfaces
  • Low overhead, instant startup
  • Cross-platform compatibility
  • Easy to extend or customize

Extensibility

  • Modular code structure
  • Config files (JSON) for customization without code changes
  • Clear service layers for swapping implementations
  • Well-documented patterns for contributions

Technical Stack

Core Technologies

  • Python 3.10+: Language of choice for LLM work
  • llama-cpp-python: High-performance local LLM inference
  • UV: Lightning-fast package management and virtual environments
  • Rich: Beautiful terminal UI without web server overhead
  • GGUF Models: Quantized models optimized for CPU inference

Coming in Phase 2

  • Vector databases (Faiss, Qdrant, or LanceDB)
  • Graph databases (NetworkX, Neo4j community)
  • Embedding models (ONNX format for CPU)
  • Web frameworks (optional FastAPI for advanced projects)

Comparison to Alternatives

| Feature          | Tiny Local LLM       | ChatGPT       | Local LLMs (Generic) | Ollama        |
|------------------|----------------------|---------------|----------------------|---------------|
| Cost             | Free                 | $20+/month    | Free                 | Free          |
| Privacy          | 100% Local           | Cloud         | 100% Local           | 100% Local    |
| Offline          | ✅ Yes               | ❌ No         | ✅ Yes               | ✅ Yes        |
| Customization    | ✅ High              | ❌ Low        | ✅ High              | ✅ Medium     |
| Multi-Agent      | ✅ Yes               | ❌ Not native | ❌ Complex           | ❌ Not native |
| Knowledge Graph  | 🔄 Coming            | ❌ No         | ❌ Custom            | ❌ No         |
| Vector Store     | 🔄 Coming            | ❌ No         | ❌ Custom            | ❌ No         |
| Tool Integration | ✅ Yes (Hyena)       | ✅ Yes        | ❌ Custom            | ❌ No         |
| Learning Curve   | 🟢 Beginner-friendly | 🟢 Easy       | 🟡 Medium            | 🟡 Medium     |

Learning Resources


Project-Specific Guides

  • Each project includes a detailed README with architecture explanations
  • Architecture diagrams and flow charts in /docs folders
  • Code comments explaining non-obvious design decisions

Community

  • GitHub Discussions on main repository
  • Issues for bug reports and feature requests
  • "Good first issue" labels for newcomers

Project Links

Phase 1: Foundation (Complete)

  1. Tiny Local LLM System - Basic local chat
  2. Tiny Local LLM System Expanded - Multi-personality chat
  3. Tiny Local Multi-Agent System - Expert orchestration

Phase 2: Knowledge & Retrieval (Coming Soon)

  1. Tiny Local LLM with Knowledge Graph - Semantic reasoning
  2. Tiny Local LLM with Vector Store - RAG and document Q&A
  3. Tiny Local Agent with Tools - Function-calling and automation

License

All projects in the Tiny Local LLM ecosystem are licensed under the MIT License. You're free to use, modify, and distribute these projects for personal and commercial use.


Acknowledgments

This ecosystem builds upon the excellent work of:

  • llama-cpp-python team for efficient CPU-based inference
  • UV creators for revolutionary package management
  • Rich library for beautiful terminal interfaces
  • Hugging Face for democratizing model access
  • The LLM community for pushing open-source AI forward

Special thanks to all contributors and users who have provided feedback and improvements.


Support

Getting Help

  • Documentation: Check project READMEs and /docs folders
  • Bug Reports: File issues on GitHub with reproduction steps
  • Questions: Use GitHub Discussions or community forums
  • Technical Issues: Check troubleshooting sections in project READMEs

Common Issues

  • Model download fails: Check internet connection and disk space
  • Slow inference: Normal on CPU; consider model size vs. hardware tradeoff
  • High memory usage: Reduce context length or batch size
  • Port conflicts: Check for other applications using the same ports

🌟 Star History

If you find this ecosystem helpful, please consider starring the projects! Your support helps us prioritize features and improvements.


Ready to Get Started?

Pick a project based on your experience level:

# Clone the entire ecosystem
git clone https://github.com/yourusername/tiny-local-llm-ecosystem.git
cd tiny-local-llm-ecosystem

# Beginner: Start with the basic system
cd Tiny-Local-LLM-System
./run_app.bat

# Intermediate: Try multi-agent reasoning
cd ../Tiny-Local-Multi-Agent-System
./run_app.bat

# Advanced/Professional: Experience full agentic AI
cd ../Hyena-AI
uv run python -m app.app

Welcome to the future of accessible, private, and powerful local AI! 🎉


Last Updated: February 2026
Version: 1.0 - Phase 1 Complete, Phase 2 In Progress
