🎯 Democratizing AI Access with Intelligent Routing
A Python-based model routing system that selects an appropriate local LLM for each type of query. It features multilingual support, conversation memory, and OpenAI-assisted routing for query analysis and optimization.
AI Society uses a dual-AI design: an OpenAI model analyzes each query, and a local model executes it. Conversation memory supports extended multi-turn interactions, and multilingual support makes the system usable in languages other than English.
- 🌍 Multilingual Intelligence - Automatic language detection, with translation of non-English queries before routing
- 🧠 Dual AI Architecture - OpenAI meta-routing + Local model execution
- 🔧 Query Optimization - Automatically rewrites terse queries into detailed prompts before execution
- 💬 Conversation Memory - Multi-turn conversations with hybrid FAISS indexing
- 🎯 Smart Model Selection - AI-powered routing to specialized models
- ⚡ Performance Tracking - Comprehensive monitoring and analytics
- Python 3.8+
- Ollama installed and running
- GPU with 8GB+ VRAM (tested on RTX 3090)
- OpenAI API key (optional, for enhanced routing)
# Clone and setup
git clone https://github.com/dexmac221/AiSociety.git
cd AiSociety
# Automated setup
chmod +x setup.sh && ./setup.sh
# Quick start
chmod +x start.sh && ./start.sh

- Web Interface: http://localhost:8000
- WebSocket API: ws://localhost:8000/ws
- REST API: http://localhost:8000/api/health
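
To confirm the server is up, you can request the health endpoint listed above. The sketch below uses only the standard library. The helper name `check_health` is hypothetical, and it assumes only that the endpoint answers with HTTP 200 when healthy; the response body format is not documented here, so it is not parsed.

```python
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8000/api/health"  # REST endpoint listed above


def check_health(url: str = HEALTH_URL, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise.

    Hypothetical helper: it inspects only the status code, because the
    format of the response body is not specified in this README.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `check_health()` after `./start.sh` should return `True` once the web interface is listening on port 8000.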
🎯 System is LIVE and fully operational!
Latest Features:
- ✅ 14 cutting-edge 2025 models integrated
- ✅ Enhanced UI with dark mode and 8+ example categories
- ✅ Multilingual support with OpenAI translation framework
- ✅ Hybrid memory system with conversation context
- ✅ Real-time technical dashboard with performance metrics
- Qwen2.5-Coder:7B - Advanced multilingual coding with debugging
- DeepSeek-Coder-v2:16B - Complex algorithms and system programming
- CodeLlama:7B - General coding, documentation, refactoring
- Phi-4:14B - Microsoft's latest math reasoning model
- Qwen2.5:7B - Algebra, calculus, statistics, problem solving
- Phi3:mini - Quick calculations and basic math
- Hermes-4:14B - NousResearch's latest uncensored creative model
- Yi:9B - Long-form content, poetry, fiction
- Neural-Chat:7B - Dialogue, conversation, roleplay
- Qwen2.5-Omni:7B - Real-time voice, text, image, audio, video
- Gemma-3:27B/4B - Google's latest multimodal models
- Gemma-3:1B - Ultra-efficient edge deployment
- Llama3.1:8B - Meta's latest reasoning and code model
- Mistral:7B - Advanced reasoning and function calling
- Query Reception - User sends message in any supported language
- Language Detection - OpenAI automatically detects query language
- Translation Layer - Non-English queries translated for optimal performance
- Memory Integration - System builds context from conversation history
- OpenAI Analysis - GPT-4.1-mini analyzes and optimizes query
- Model Selection - AI recommends optimal local model
- Local Execution - Enhanced query runs on selected model
- Response Enhancement - Results include optimization details and context
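
The eight steps above can be read as a single function pipeline. The following is a minimal sketch under stated assumptions, not the project's actual code: `route_query`, `Analysis`, and `RoutedResponse` are hypothetical names, the four callables stand in for the OpenAI and Ollama calls, and the six-message context window is an arbitrary choice for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Analysis:
    optimized_query: str  # rewritten query produced by the meta-router
    model: str            # recommended local model name


@dataclass
class RoutedResponse:
    language: str
    model: str
    optimized_query: str
    text: str
    context_messages: int


def route_query(
    query: str,
    history: List[str],
    detect_language: Callable[[str], str],
    translate: Callable[[str, str], str],
    analyze: Callable[[str, List[str]], Analysis],
    run_local: Callable[[str, str], str],
) -> RoutedResponse:
    """Run one query through the steps listed above (hypothetical sketch)."""
    # Steps 1-2: receive the query and detect its language.
    language = detect_language(query)
    # Step 3: translate non-English queries before analysis.
    english = query if language == "en" else translate(query, language)
    # Step 4: build context from history (assumed: the last six messages).
    context = history[-6:]
    # Steps 5-6: the meta-router optimizes the query and recommends a model.
    analysis = analyze(english, context)
    # Step 7: run the optimized query on the selected local model.
    text = run_local(analysis.model, analysis.optimized_query)
    # Step 8: return the answer together with the routing details.
    return RoutedResponse(
        language=language,
        model=analysis.model,
        optimized_query=analysis.optimized_query,
        text=text,
        context_messages=len(context),
    )
```

Passing the external calls in as parameters keeps the control flow testable without an API key or a running Ollama instance.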
- Universal Language Support - Spanish, French, German, Italian, Portuguese, Japanese, Chinese, and more
- Intelligent Translation - OpenAI detects language and translates for optimal comprehension
- Native Response Language - Models receive instructions to respond in original language
- Real-time Indicators - Language panel shows detection and translation status
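
The "Native Response Language" behavior can be pictured as prefixing the translated query with an instruction that names the original language. This is a minimal sketch: `build_local_prompt` and the `LANGUAGE_NAMES` table are hypothetical, and the prompt wording the project actually uses is not shown in this README.

```python
# ISO 639-1 code -> English name (illustrative subset of the languages above).
LANGUAGE_NAMES = {
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ja": "Japanese",
    "zh": "Chinese",
}


def build_local_prompt(english_query: str, original_language: str) -> str:
    """Attach a reply-language instruction for non-English conversations."""
    if original_language == "en":
        return english_query  # nothing to add for English queries
    # Fall back to the raw language code when the name is not in the table.
    name = LANGUAGE_NAMES.get(original_language, original_language)
    return f"Respond in {name}.\n\n{english_query}"
```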
- Multi-turn Conversations - "Write a function" → "Explain that code" → "Make it more efficient"
- Context Awareness - Remembers previous messages and maintains flow
- Smart References - Understands "that code", "the previous example"
- Hybrid Architecture - FAISS indexing with OpenAI summarization
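
One way to picture the hybrid approach is a context made of two parts: the most recent messages, plus older messages recalled by similarity to the current query. The sketch below is illustrative only. It swaps in bag-of-words cosine similarity as a dependency-free stand-in for FAISS vector search, leaves out the OpenAI summarization step entirely, and uses a hypothetical `ConversationMemory` class.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def _vectorize(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ConversationMemory:
    """Stores every message; context = recalled older messages + recent window."""

    def __init__(self, recent_window: int = 4, recall_k: int = 2) -> None:
        self.messages: List[Tuple[str, str]] = []  # (role, text)
        self.recent_window = recent_window
        self.recall_k = recall_k

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def build_context(self, query: str) -> List[Tuple[str, str]]:
        # Split history into an older part and the recent window.
        split = max(len(self.messages) - self.recent_window, 0)
        older, recent = self.messages[:split], self.messages[split:]
        # Score older messages against the query.
        q = _vectorize(query)
        scored = [(_cosine(q, _vectorize(text)), i) for i, (_, text) in enumerate(older)]
        top = sorted(scored, reverse=True)[: self.recall_k]
        # Keep only real matches and restore chronological order.
        recalled = sorted(i for score, i in top if score > 0)
        return [older[i] for i in recalled] + recent
```

With this layout, a follow-up such as "Make the sorting function more efficient" can still pull in the original request even after unrelated turns have pushed it out of the recent window.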
| Original Query | OpenAI Enhancement |
|---|---|
| "sort list" | "Write a well-documented Python function with error handling..." |
| "quantum" | "Explain quantum computing in simple terms with examples..." |
| "5+3*2" | "Calculate step-by-step showing order of operations..." |
# Set API key for enhanced routing
export OPENAI_API_KEY="your-api-key-here"

{
  "max_model_size": "8GB",
  "openai_meta_routing": {
    "enabled": true,
    "model": "gpt-4.1-mini",
    "cache_decisions": true
  },
  "specialization_weights": {
    "coding": 1.5,
    "math": 1.3,
    "creative": 1.2
  }
}

# Run comprehensive tests
./test_system.py
# Test specific components
python test_multilingual.py
python test_conversation_memory.py
python test_query_optimization.py

👤 "Debug this Python code: def fibonacci(n): return n + fibonacci(n-1)"
🔧 Enhanced: "Analyze and debug this recursive Python function..."
🤖 qwen2.5-coder → Identifies missing base case and infinite recursion

👤 "Write a Python sorting function"
🤖 [Provides function] 🧠 2 messages
👤 "Explain how that works"
🤖 [Explains previous function] 🧠 4 messages
👤 "Make it more efficient"
🤖 [Improves with optimizations] 🧠 6 messages
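
The `max_model_size` and `specialization_weights` settings in the configuration suggest a weighted scoring step before a model is picked, as in the first example, where a debugging request lands on a coding model. The sketch below is an assumption about how such weights could be applied, not the project's actual scoring code. The `CANDIDATES` catalogue (including its sizes and base scores), `parse_size_gb`, and `pick_model` are all hypothetical.

```python
import json
from typing import Any, Dict, List

# Subset of the configuration shown earlier.
CONFIG: Dict[str, Any] = json.loads("""
{
  "max_model_size": "8GB",
  "specialization_weights": {"coding": 1.5, "math": 1.3, "creative": 1.2}
}
""")

# Hypothetical catalogue: sizes and base scores are illustrative, not measured.
CANDIDATES: List[Dict[str, Any]] = [
    {"name": "qwen2.5-coder:7b", "specialization": "coding", "size_gb": 5.0, "base_score": 0.7},
    {"name": "deepseek-coder-v2:16b", "specialization": "coding", "size_gb": 9.0, "base_score": 0.9},
    {"name": "qwen2.5:7b", "specialization": "math", "size_gb": 5.0, "base_score": 0.7},
    {"name": "llama3.1:8b", "specialization": "general", "size_gb": 5.0, "base_score": 0.8},
]


def parse_size_gb(value: str) -> float:
    """Convert a string such as "8GB" to a number of gigabytes."""
    return float(value.strip().upper().replace("GB", ""))


def pick_model(query_type: str, candidates: List[Dict[str, Any]], config: Dict[str, Any]) -> str:
    """Pick the highest-scoring model that fits within max_model_size."""
    limit = parse_size_gb(config["max_model_size"])
    weights = config["specialization_weights"]
    best_name, best_score = None, float("-inf")
    for model in candidates:
        if model["size_gb"] > limit:
            continue  # too large for the configured budget
        # Boost a model only when its specialization matches the query type.
        matches = model["specialization"] == query_type
        weight = weights.get(query_type, 1.0) if matches else 1.0
        score = model["base_score"] * weight
        if score > best_score:
            best_name, best_score = model["name"], score
    if best_name is None:
        raise ValueError("no candidate fits within max_model_size")
    return best_name
```

With these illustrative numbers, `pick_model("coding", CANDIDATES, CONFIG)` selects `qwen2.5-coder:7b`: the larger coding model is filtered out by the 8GB limit, and the 1.5 weight lifts the smaller one above the general-purpose model.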
AiSociety/
├── src/
│   ├── daemon/          # Model discovery
│   ├── memory/          # Conversation memory
│   └── routing/         # Intelligent routing
├── web/
│   └── app.py           # FastAPI web interface
├── config/              # Configuration files
├── docs/                # Documentation
├── requirements.txt     # Dependencies
├── setup.sh             # Setup script
└── start.sh             # Start script
See CONTRIBUTING.md for guidelines on:
- Reporting issues and feature requests
- Development setup and workflow
- Code style and testing requirements
- Pull request process
- DEVELOPMENT.md - Development guide and architecture
- OPENAI_META_ROUTING.md - Technical deep dive on meta-routing
- CHANGELOG.md - Version history and updates
- SECURITY.md - Security policy and reporting
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama team for excellent local LLM infrastructure
- OpenAI for API integration capabilities
- FastAPI for the robust web framework
- FAISS for efficient vector similarity search
- GitHub Issues: Report bugs and request features
- Discussions: Community discussions and questions

