⚡ This repo is built for local workstation or cloud-native deployment. It includes everything you need to run a private RAG pipeline with Ollama + OpenWebUI for local model hosting. Secure, composable, and deliciously overengineered.
Semblance RAG is a modular Retrieval-Augmented Generation (RAG) platform designed to support knowledge indexing, semantic search, and language model augmentation workflows. It leverages containerized microservices to ensure a flexible and extensible architecture for AI-enabled applications.
This project is composed of the following core services, managed via Docker Compose:
- **API service (FastAPI)**
  - Provides REST endpoints for query handling and health checks.
  - Integrates with Weaviate and language model APIs (e.g. OpenAI); see the sketch after this list.
- **Application backend (Laravel)**
  - Contains business logic, orchestration routines, and potential user interface integration.
  - Suitable for administrative interfaces or broader platform integration.
- **Weaviate**
  - Vector search engine used for semantic retrieval.
  - Uses the `text2vec-openai` module for text embedding.
- **Web UI (planned)**
  - Placeholder for a future browser-based UI.
  - May include a chat interface or a RAG analytics dashboard.
- **Model proxy**
  - Middleware layer for forwarding or managing API access to OpenAI or other model providers.
  - Can be used for logging, rate limiting, or environment abstraction.
- **Logstash**
  - Handles log ingestion and routing to Elasticsearch.
  - Can also support document preprocessing and indexing.
- **Elasticsearch**
  - Indexing and search backend used for storing structured logs or documents.
  - Works in tandem with Logstash and Kibana.
- **Kibana**
  - Web-based dashboard for monitoring and exploring data stored in Elasticsearch.
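As a rough illustration of how these pieces fit together, below is a minimal sketch of the query flow: the API service retrieves semantically similar chunks from Weaviate and feeds them to an OpenAI completion. The class name (`Document`), field names, model choice, and service URLs are assumptions rather than this repo's actual schema, and the snippet targets the Weaviate v3 Python client and the pre-1.0 `openai` package.

```python
# Illustrative sketch only: semantic retrieval from Weaviate followed by an
# augmented OpenAI completion. Class/field names and URLs are assumptions.
from fastapi import FastAPI
from pydantic import BaseModel
import weaviate  # Weaviate v3 Python client
import openai    # pre-1.0 openai package

app = FastAPI()
wv = weaviate.Client("http://weaviate:8080")  # Weaviate container on the Compose network

class QueryRequest(BaseModel):
    query: str
    top_k: int = 5

@app.post("/query")
def query(req: QueryRequest):
    # Semantic retrieval; vectors come from the text2vec-openai module
    result = (
        wv.query.get("Document", ["text", "source"])
        .with_near_text({"concepts": [req.query]})
        .with_limit(req.top_k)
        .do()
    )
    chunks = result["data"]["Get"]["Document"]
    context = "\n\n".join(c["text"] for c in chunks)

    # Augment the language model with the retrieved context
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {req.query}"},
        ],
    )
    return {"answer": completion.choices[0].message["content"], "chunks": chunks}
```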
To run the stack locally you will need:

- Docker and Docker Compose
- Git
- An OpenAI API key (if using OpenAI-based embeddings or completions)
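The OpenAI-backed components need the key available in their environment; `OPENAI_API_KEY` is the variable the official OpenAI client reads by default, though how this project injects it into the containers (e.g. a `.env` file) is an assumption. A quick local sanity check:

```python
# Verify the OpenAI API key is set before bringing up the stack.
# OPENAI_API_KEY is the standard variable read by the openai client;
# this project's own configuration mechanism is an assumption.
import os

if not os.environ.get("OPENAI_API_KEY"):
    raise SystemExit("OPENAI_API_KEY is not set")
print("OpenAI key detected")
```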
Clone the repository and bring up the stack:

```bash
git clone git@github.com:eooo-io/semblance-rag.git
cd semblance-rag
docker compose up -d   # start all services via Docker Compose
```

Once the containers are running, the following services are available:

| Service | URL |
|---|---|
| FastAPI Docs | http://localhost:8000/docs |
| Laravel App | http://localhost:9000 |
| Kibana | http://localhost:5601 |
Send a POST request to /query:
```json
{
  "query": "What is Semblance?",
  "top_k": 5
}
```

The expected response includes the top-ranked document chunks and associated metadata.
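For example, a minimal Python client (this assumes the endpoint is served by the FastAPI container at `http://localhost:8000`, per the table above; the exact response keys are not documented here):

```python
# Minimal client sketch for the /query endpoint.
# Host/port follow the service table above; the response shape is an assumption.
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"query": "What is Semblance?", "top_k": 5},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # top-ranked chunks plus metadata
```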
Planned features and enhancements include:
- Support for local GPU-backed model inference via Ollama and HuggingFace models.
- Integration of a full ingestion pipeline with semantic chunking.
- Document upload and management via the Laravel backend.
- Role-based access control for secure deployments.
- Scalable deployment strategy (e.g., Docker Swarm or Kubernetes).
This project is in active development. Currently NOT looking for contributions.
This repository is licensed under the MIT License. See LICENSE for details.