Skip to content

dxpython/SA-RAG

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SA-RAG: Semantic-Accelerated Retrieval Augmentation Framework

SA-RAG Logo

Next-Generation Semantic Retrieval Operating System

.NET Python Rust License

High Performance · Multi-Stage Retrieval · Graph-RAG · Memory-Enhanced · Self-Evolving

English | 中文


📖 Table of Contents


🎯 Project Overview

SA-RAG is an enterprise-grade Semantic Retrieval Augmented Generation (RAG) framework built with a Rust + Python + C# three-tier architecture, providing:

  • 🚀 High-Performance Retrieval Engine: Rust-based core retrieval system
  • 🧠 Intelligent Orchestration Layer: Python implementation for LLM integration and query analysis
  • 🤖 Agent System: C# Agent framework based on SharpAIKit
  • 📊 Multi-Stage Retrieval: Hybrid retrieval combining Vector + BM25 + Graph + Memory
  • 🔄 Self-Evolving Capabilities: Learning ranker, consistency checking, autonomous debugging

Design Philosophy

SA-RAG is not just a RAG framework, but a Semantic Retrieval Operating System with:

  • Adaptive: Automatically optimizes retrieval strategies based on data
  • Explainable: Complete execution graph tracking and visualization
  • Extensible: Plugin-based architecture supporting custom components
  • Self-Optimizing: Automatically learns optimal ranking weights
  • Learnable: Continuously improves retrieval quality from data

✨ Core Features

🔥 Foundation Capabilities

Feature Description Implementation
Semantic Node Engine Semantic node segmentation engine Rust
Hybrid Retrieval Vector + BM25 hybrid retrieval Rust
Graph-RAG Knowledge graph-enhanced retrieval Rust
Multi-Stage Ranking Multi-stage ranking fusion Rust
Memory Store Long-term memory management Rust
Diff Index Incremental index updates Rust

🚀 Next-Generation Capabilities

Feature Description Implementation
Self-Evolving Ranker Self-evolving ranker (RL + contrastive learning) Rust + Python
Execution Graph Semantic execution graph visualization Rust + Python
Cognitive Memory Three-tier memory system (short-term/long-term/semantic consolidation) Rust + Python
Multimodal Nodes Multimodal node support (images/tables/code/formulas) Rust
Feature Store High-dimensional feature storage (versioned + TTL) Rust
Autonomous Debugger Autonomous retrieval debugger Rust + Python
Consistency Checker Retrieval consistency validator Rust + Python
Plugin System Plugin-based extensible architecture Rust + Python

🏗️ System Architecture

SA-RAG System Overview

Complete System Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                    SA-AgentOS (C# Agent Layer)                   │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                │
│  │   Agent    │  │  Planner   │  │   Memory   │                │
│  │  Pipeline  │  │            │  │    Tool    │                │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘                │
│        │                │                │                        │
│        └────────────────┴────────────────┘                        │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │   SharpAIKit Tools  │                              │
│              │  ┌──────────────┐  │                              │
│              │  │  Knowledge   │  │                              │
│              │  │  Retrieval   │  │                              │
│              │  └──────┬───────┘  │                              │
│              │  ┌──────┴───────┐  │                              │
│              │  │   DeepSeek   │  │                              │
│              │  │  Reasoning   │  │                              │
│              │  └──────────────┘  │                              │
└──────────────┼────────────────────┼──────────────────────────────┘
               │ HTTP/gRPC          │
               ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│              Python Orchestration Layer                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │  Query   │  │ Structured│  │ Embedding│  │   LLM    │       │
│  │ Analyzer │  │  Query    │  │ Generator│  │ Service  │       │
│  └────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     │
│       │              │               │               │            │
│       └──────────────┴───────────────┴───────────────┘            │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │   RAG Pipeline      │                              │
│              │   (Orchestrator)    │                              │
└──────────────┼─────────────────────┼──────────────────────────────┘
               │ PyO3 / FFI          │
               ▼                     ▼
┌─────────────────────────────────────────────────────────────────┐
│              Rust Core Engine (High Performance)                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Semantic │  │  Vector  │  │   BM25   │  │  Graph   │       │
│  │  Node    │  │  Index   │  │  Index   │  │   RAG    │       │
│  └────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     │
│       │              │               │               │            │
│       └──────────────┴───────────────┴───────────────┘            │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │  Multi-Stage        │                              │
│              │  Ranking Engine     │                              │
│              └─────────────────────┘                              │
│                                                                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Learning │  │Execution │  │  Feature │  │  Memory  │       │
│  │ Ranker   │  │  Graph   │  │  Store   │  │  System  │       │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘       │
└─────────────────────────────────────────────────────────────────┘

📁 Project Structure

SA-RAG/
├── SA_RAG/                          # Core RAG Framework
│   ├── src/                         # Rust Core Engine
│   │   ├── lib.rs                   # Main entry (PyO3 bindings)
│   │   ├── semantic_node.rs         # Semantic node engine
│   │   ├── parser.rs                # Document parser
│   │   ├── indexer/                 # Indexer modules
│   │   │   ├── vector.rs            # HNSW vector index
│   │   │   ├── bm25.rs              # BM25 full-text index
│   │   │   ├── hybrid.rs            # Hybrid retrieval engine
│   │   │   └── multi_stage.rs       # Multi-stage ranking
│   │   ├── graph/                   # Graph retrieval module
│   │   │   ├── graph.rs             # Knowledge graph
│   │   │   └── expansion.rs         # Graph expansion algorithms
│   │   ├── memory.rs                # Memory store
│   │   ├── diff.rs                  # Incremental indexing
│   │   ├── learning_ranker/         # Self-evolving ranker
│   │   ├── execution_graph/         # Execution graph
│   │   ├── feature_store/           # Feature store
│   │   ├── consistency/             # Consistency checker
│   │   ├── debugger/                # Autonomous debugger
│   │   └── plugins/                 # Plugin system
│   ├── python/                      # Python Orchestration Layer
│   │   ├── sa_rag/
│   │   │   ├── __init__.py          # RAG main class
│   │   │   ├── client.py            # Client API
│   │   │   ├── rag.py               # RAG pipeline
│   │   │   ├── orchestrator.py      # Orchestrator
│   │   │   ├── query_analysis.py    # Query analysis
│   │   │   ├── llm.py               # LLM service
│   │   │   ├── embedding.py         # Embedding generation
│   │   │   ├── api_server.py        # FastAPI server
│   │   │   └── [next-gen modules]   # Next-generation feature modules
│   │   └── pyproject.toml
│   ├── examples/                    # Example code
│   ├── tests/                       # Test suite
│   └── Cargo.toml
│
├── sa_agentos/                      # Agent System
│   ├── agent_csharp/                # C# Agent Project
│   │   └── SaAgentOS/
│   │       ├── Program.cs           # Main program
│   │       ├── Agents/              # Agent implementation
│   │       ├── Tools/               # Tool collection
│   │       ├── Pipelines/           # Pipeline orchestration
│   │       └── Config/              # Configuration files
│   └── python_server/               # Python HTTP Service
│       ├── server.py                # FastAPI server
│       └── tests/                   # Tests
│
├── SharpAIKit/                      # C# AI Framework (Dependency)
│   ├── src/SharpAIKit/              # Core library
│   │   ├── Agent/                   # Agent framework
│   │   ├── LLM/                     # LLM client
│   │   ├── Memory/                  # Memory management
│   │   └── [other modules]
│   └── samples/                     # Sample projects
│
└── docs/                            # Documentation
    ├── architecture.md              # Architecture documentation
    ├── api_reference.md             # API reference
    └── NEXT_GEN_FEATURES.md         # Next-generation features documentation

🚀 Quick Start

Prerequisites

  • Rust: 1.70+ (for compiling core engine)
  • Python: 3.10+ (recommended to use uv package manager)
  • .NET: 8.0+ (for Agent system)
  • DeepSeek API Key: for LLM reasoning

1. Install Rust Core Engine

cd SA_RAG
cargo build --release

2. Install Python Package

cd SA_RAG/python

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .

3. Run Tests

cd SA_RAG/tests
uv run pytest test_framework_validation.py -v

4. Start Python API Server

cd SA_RAG/python
uv run python -m sa_rag.api_server

5. Start SA-AgentOS

# 1. Start Python RAG service
cd sa_agentos/python_server
uv run python server.py

# 2. Configure DeepSeek API Key (in appsettings.json)
# 3. Run C# Agent
cd sa_agentos/agent_csharp/SaAgentOS
dotnet run

🔧 Core Components

SA-RAG Core

Rust Core Engine

  • Semantic Node Engine: Intelligent document segmentation
  • Hybrid Index: Vector + BM25 hybrid index
  • Graph Engine: Knowledge graph construction and querying
  • Multi-Stage Ranker: Multi-stage ranking fusion
  • Memory Store: Long-term memory management
  • Diff Index: Incremental index updates

Python Orchestration Layer

  • Query Analyzer: LLM-driven query analysis
  • Structured Query Builder: Structured query generation
  • Orchestrator: Retrieval strategy orchestration
  • Embedding Generator: Embedding vector generation
  • RAG Pipeline: End-to-end RAG workflow

SA-AgentOS

C# Agent System

  • SaRagAgent: Core Agent class (based on SharpAIKit)
  • KnowledgeRetrievalTool: Knowledge retrieval tool
  • DeepSeekReasoningTool: DeepSeek reasoning tool
  • MemoryTool: Memory management tool
  • SaAgentPipeline: Agent execution pipeline

Python HTTP Service

  • FastAPI Server: Provides REST API
  • RAG Query Endpoint: /rag/query
  • Memory Management: /rag/memory
  • Health Check: /health

🔧 SharpAIKit: C# AI Framework

SharpAIKit is a unified .NET large-model application and agentic AI development framework that powers the SA-AgentOS system. It provides:

Key Features

  • Unified LLM Interface: One API for all OpenAI-compatible models (OpenAI, DeepSeek, Qwen, Mistral, etc.)
  • LCEL-style Chains: Elegant chain composition with pipe operators
  • Multiple Memory Strategies: Buffer, Window, Summary, Vector, and Entity memory
  • Advanced Agents: ReAct, Plan-and-Execute, and Multi-Agent systems
  • Native C# Code Interpreter: Execute C# code directly using Roslyn, no Python needed
  • SharpGraph: Graph-based orchestration with FSM, supports loops and complex branches
  • DSPy-style Optimizer: Automatically optimize prompts through iterative improvement

Integration with SA-RAG

SA-AgentOS leverages SharpAIKit's powerful Agent framework:

  • Tool System: All SA-RAG tools inherit from SharpAIKit.Agent.ToolBase
  • Agent Framework: SaRagAgent uses SharpAIKit.Agent.AiAgent for tool orchestration
  • LLM Client: Uses SharpAIKit.LLM.LLMClientFactory for DeepSeek integration
  • Memory Management: Uses SharpAIKit.Memory.BufferMemory for conversation history

For more information about SharpAIKit, see: SharpAIKit/README.md


🔧 SharpAIKit: C# AI Framework

SharpAIKit is a unified .NET large-model application and agentic AI development framework that powers the SA-AgentOS system. It's the .NET equivalent of LangChain, but with killer features that leverage the .NET ecosystem.

Key Features

  • Unified LLM Interface: One API for all OpenAI-compatible models (OpenAI, DeepSeek, Qwen, Mistral, Yi, Groq, Moonshot, Ollama, etc.)
  • LCEL-style Chains: Elegant chain composition with pipe operators
  • Multiple Memory Strategies: Buffer, Window, Summary, Vector, and Entity memory
  • Advanced Agents: ReAct, Plan-and-Execute, and Multi-Agent systems
  • Native C# Code Interpreter: Execute C# code directly using Roslyn, no Python needed 🔮
  • SharpGraph: Graph-based orchestration with FSM, supports loops and complex branches 🕸️
  • DSPy-style Optimizer: Automatically optimize prompts through iterative improvement 🧬

Killer Features Beyond LangChain

  1. 🔮 Native C# Code Interpreter: Execute C# code directly using Roslyn - no Python dependency, blazing fast!
  2. 🕸️ SharpGraph: Graph orchestration with loops and complex branches - more powerful than LangGraph
  3. 🧬 DSPy Optimizer: Auto-optimize prompts, gets smarter over time

Integration with SA-RAG

SA-AgentOS leverages SharpAIKit's powerful Agent framework:

  • Tool System: All SA-RAG tools inherit from SharpAIKit.Agent.ToolBase
    • Tools are automatically discovered via [Tool] and [Parameter] attributes
    • Seamless integration with Agent's tool calling mechanism
  • Agent Framework: SaRagAgent uses SharpAIKit.Agent.AiAgent for tool orchestration
    • Automatic tool discovery and execution
    • Built-in conversation history management
  • LLM Client: Uses SharpAIKit.LLM.LLMClientFactory for DeepSeek integration
    • Unified interface for all OpenAI-compatible APIs
    • Streaming support
  • Memory Management: Uses SharpAIKit.Memory.BufferMemory for conversation history
    • Automatic context management
    • Configurable history length

Quick Example

using SharpAIKit.LLM;

// Create client for any OpenAI-compatible API
var client = LLMClientFactory.Create(
    apiKey: "your-api-key",
    baseUrl: "https://api.deepseek.com/v1",
    model: "deepseek-chat"
);

// Simple chat
var response = await client.ChatAsync("Hello!");

// Streaming output
await foreach (var chunk in client.ChatStreamAsync("Tell me a story"))
{
    Console.Write(chunk);
}

Supported LLM Providers

Provider Base URL Preset Method
OpenAI https://api.openai.com/v1 CreateOpenAI()
DeepSeek https://api.deepseek.com/v1 CreateDeepSeek()
Qwen (Alibaba) https://dashscope.aliyuncs.com/compatible-mode/v1 CreateQwen()
Mistral https://api.mistral.ai/v1 CreateMistral()
Yi (01.AI) https://api.lingyiwanwu.com/v1 CreateYi()
Groq https://api.groq.com/openai/v1 CreateGroq()
Moonshot (Kimi) https://api.moonshot.cn/v1 CreateMoonshot()
Ollama (Local) http://localhost:11434 CreateOllama()
Any OpenAI-compatible Custom Create(key, url, model)

For more information about SharpAIKit, see: SharpAIKit/README.md


🤖 SA-AgentOS: Intelligent Agent System

SA-AgentOS is an intelligent agent system built on top of SA-RAG Core and SharpAIKit, providing:

Core Capabilities

  • Planning: Intelligent decision-making about when to retrieve knowledge
  • World Model (Memory): Long-term memory management
  • Knowledge Retrieval: Calls SA-RAG for multi-stage retrieval
  • Execution Graph: Visualizes Agent reasoning process
  • Self-Correction: Built-in debugging and reflection capabilities
  • Multi-Turn Dialogue: Supports context-aware conversations

Usage Example

# Start Agent
cd sa_agentos/agent_csharp/SaAgentOS
dotnet run

# Interactive dialogue
> Hello, I'm John
> What's my name?  # Agent remembers: Your name is John
> Please help me find information about Python
> clear  # Clear conversation history
> exit   # Exit

For detailed documentation, see: sa_agentos/README.md


📊 Performance Comparison

Feature SA-RAG LangChain LlamaIndex Haystack Chroma Weaviate
Performance
Rust Core Engine
Native Speed
Core Features
Semantic Node Parsing ⚠️ ⚠️
Multi-stage Retrieval ⚠️
Graph-RAG ⚠️ ⚠️
Differential Indexing
Long-term Memory ⚠️ ⚠️ ⚠️
Hybrid Retrieval ⚠️ ⚠️
Next-Gen Features
Self-Evolving Ranker
Execution Graph
Cognitive Memory System
Multimodal Support ⚠️ ⚠️ ⚠️ ⚠️
Feature Store
Autonomous Debugger
Consistency Checker
Plugin System ⚠️ ⚠️ ⚠️
Architecture
Python + Rust Hybrid
PyO3 Integration
Extensible Plugin API ⚠️ ⚠️ ⚠️
Performance Metrics
Indexing Speed ~1000 docs/s ~100 docs/s ~200 docs/s ~150 docs/s ~500 docs/s ~800 docs/s
Retrieval Latency <10ms ~50ms ~30ms ~40ms ~20ms ~15ms
Memory Efficiency High Medium Medium Medium High High

Legend:

  • ✅ Full support
  • ⚠️ Partial support or requires additional setup
  • ❌ Not supported

Key Differentiators

  1. Performance: Rust core engine provides native-speed operations, 5-10x faster than pure Python implementations
  2. Self-Learning: Unique self-evolving ranker learns optimal retrieval strategies from your data
  3. Explainability: Execution graph visualization shows exactly how queries are processed
  4. Cognitive Memory: Three-tier memory system (short-term/long-term/semantic) with automatic consolidation
  5. Autonomous Debugging: Automatically identifies and suggests fixes for retrieval issues
  6. Consistency: Built-in drift detection ensures reproducible and reliable retrieval
  7. Extensibility: Comprehensive plugin system for custom rankers, parsers, and policies
  8. Multimodal-Ready: Native support for images, tables, code blocks, and formulas

🧪 Testing

Run All Tests

cd SA_RAG/tests
uv run pytest test_framework_validation.py -v

Test Coverage

  • ✅ Framework initialization tests
  • ✅ Document indexing tests
  • ✅ Search functionality tests
  • ✅ Q&A functionality tests
  • ✅ Memory management tests
  • ✅ Document update tests
  • ✅ Next-generation feature tests
  • ✅ Error handling tests
  • ✅ Performance tests
  • ✅ Integration tests

📚 API Documentation

Python API

from sa_rag import RAG

# Initialize
rag = RAG(llm_provider="deepseek", embedding_provider="deepseek")

# Index documents
rag.index_documents(["Document 1", "Document 2"])

# Search
results = rag.search("query", top_k=5, use_graph=True, use_memory=True)

# Q&A
answer = rag.ask("question", top_k=5, use_graph=True, use_memory=True)

# Add memory
rag.add_memory("Important information", importance=0.8)

REST API

# Query
curl -X POST http://localhost:8000/rag/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is Python?",
    "use_graph": true,
    "use_memory": true,
    "top_k": 6
  }'

# Add memory
curl -X POST "http://localhost:8000/rag/memory?text=User likes Python&importance=0.8"

For detailed API documentation, see: docs/api_reference.md


🔍 Architecture Documentation


🛠️ Development Guide

Build Rust Core

cd SA_RAG
cargo build --release

Develop Python Package

cd SA_RAG/python
uv sync --dev
uv run pytest

Develop C# Agent

cd sa_agentos/agent_csharp/SaAgentOS
dotnet build
dotnet test

📈 Roadmap

Completed ✅

  • Rust core engine
  • Python orchestration layer
  • Multi-stage retrieval
  • Graph-RAG
  • Memory management
  • Incremental indexing
  • Next-generation features (8 major modules)
  • SA-AgentOS Agent system
  • Conversation history support

Planned 🚧

  • Web UI console
  • Distributed deployment support
  • More LLM provider support
  • Performance optimization
  • More examples and tutorials

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.


🙏 Acknowledgments

  • SharpAIKit: C# AI framework support
  • PyO3: Rust-Python bindings
  • FastAPI: Python web framework
  • DeepSeek: LLM service provider

SA-RAG - Next-Generation Semantic Retrieval Operating System

⭐ If this project helps you, please give it a Star!

About

Semantic-Accelerated Retrieval Augmentation Framework

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published