SA-RAG: Semantic-Accelerated Retrieval Augmentation Framework

Next-Generation Semantic Retrieval Operating System

High Performance · Multi-Stage Retrieval · Graph-RAG · Memory-Enhanced · Self-Evolving

📖 Table of Contents

Project Overview
Core Features
System Architecture
Project Structure
Quick Start
Core Components
SharpAIKit
SA-AgentOS
API Documentation
Testing
Performance Comparison
Contributing
License

🎯 Project Overview

SA-RAG is an enterprise-grade Semantic Retrieval Augmented Generation (RAG) framework built with a Rust + Python + C# three-tier architecture, providing:

🚀 High-Performance Retrieval Engine: Rust-based core retrieval system
🧠 Intelligent Orchestration Layer: Python implementation for LLM integration and query analysis
🤖 Agent System: C# Agent framework based on SharpAIKit
📊 Multi-Stage Retrieval: Hybrid retrieval combining Vector + BM25 + Graph + Memory
🔄 Self-Evolving Capabilities: Learning ranker, consistency checking, autonomous debugging

Design Philosophy

SA-RAG is not just a RAG framework, but a Semantic Retrieval Operating System with:

Adaptive: Automatically optimizes retrieval strategies based on data
Explainable: Complete execution graph tracking and visualization
Extensible: Plugin-based architecture supporting custom components
Self-Optimizing: Automatically learns optimal ranking weights
Learnable: Continuously improves retrieval quality from data

✨ Core Features

🔥 Foundation Capabilities

Feature	Description	Implementation
Semantic Node Engine	Semantic node segmentation engine	Rust
Hybrid Retrieval	Vector + BM25 hybrid retrieval	Rust
Graph-RAG	Knowledge graph-enhanced retrieval	Rust
Multi-Stage Ranking	Multi-stage ranking fusion	Rust
Memory Store	Long-term memory management	Rust
Diff Index	Incremental index updates	Rust

🚀 Next-Generation Capabilities

Feature	Description	Implementation
Self-Evolving Ranker	Self-evolving ranker (RL + contrastive learning)	Rust + Python
Execution Graph	Semantic execution graph visualization	Rust + Python
Cognitive Memory	Three-tier memory system (short-term/long-term/semantic consolidation)	Rust + Python
Multimodal Nodes	Multimodal node support (images/tables/code/formulas)	Rust
Feature Store	High-dimensional feature storage (versioned + TTL)	Rust
Autonomous Debugger	Autonomous retrieval debugger	Rust + Python
Consistency Checker	Retrieval consistency validator	Rust + Python
Plugin System	Plugin-based extensible architecture	Rust + Python

🏗️ System Architecture

Complete System Architecture Diagram

┌─────────────────────────────────────────────────────────────────┐
│                    SA-AgentOS (C# Agent Layer)                   │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐                │
│  │   Agent    │  │  Planner   │  │   Memory   │                │
│  │  Pipeline  │  │            │  │    Tool    │                │
│  └─────┬──────┘  └─────┬──────┘  └─────┬──────┘                │
│        │                │                │                        │
│        └────────────────┴────────────────┘                        │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │   SharpAIKit Tools  │                              │
│              │  ┌──────────────┐  │                              │
│              │  │  Knowledge   │  │                              │
│              │  │  Retrieval   │  │                              │
│              │  └──────┬───────┘  │                              │
│              │  ┌──────┴───────┐  │                              │
│              │  │   DeepSeek   │  │                              │
│              │  │  Reasoning   │  │                              │
│              │  └──────────────┘  │                              │
└──────────────┼────────────────────┼──────────────────────────────┘
               │ HTTP/gRPC          │
               ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│              Python Orchestration Layer                          │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │  Query   │  │ Structured│  │ Embedding│  │   LLM    │       │
│  │ Analyzer │  │  Query    │  │ Generator│  │ Service  │       │
│  └────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     │
│       │              │               │               │            │
│       └──────────────┴───────────────┴───────────────┘            │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │   RAG Pipeline      │                              │
│              │   (Orchestrator)    │                              │
└──────────────┼─────────────────────┼──────────────────────────────┘
               │ PyO3 / FFI          │
               ▼                     ▼
┌─────────────────────────────────────────────────────────────────┐
│              Rust Core Engine (High Performance)                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Semantic │  │  Vector  │  │   BM25   │  │  Graph   │       │
│  │  Node    │  │  Index   │  │  Index   │  │   RAG    │       │
│  └────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘     │
│       │              │               │               │            │
│       └──────────────┴───────────────┴───────────────┘            │
│                         │                                         │
│              ┌──────────┴──────────┐                              │
│              │  Multi-Stage        │                              │
│              │  Ranking Engine     │                              │
│              └─────────────────────┘                              │
│                                                                   │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Learning │  │Execution │  │  Feature │  │  Memory  │       │
│  │ Ranker   │  │  Graph   │  │  Store   │  │  System  │       │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘       │
└─────────────────────────────────────────────────────────────────┘

📁 Project Structure

SA-RAG/
├── SA_RAG/                          # Core RAG Framework
│   ├── src/                         # Rust Core Engine
│   │   ├── lib.rs                   # Main entry (PyO3 bindings)
│   │   ├── semantic_node.rs         # Semantic node engine
│   │   ├── parser.rs                # Document parser
│   │   ├── indexer/                 # Indexer modules
│   │   │   ├── vector.rs            # HNSW vector index
│   │   │   ├── bm25.rs              # BM25 full-text index
│   │   │   ├── hybrid.rs            # Hybrid retrieval engine
│   │   │   └── multi_stage.rs       # Multi-stage ranking
│   │   ├── graph/                   # Graph retrieval module
│   │   │   ├── graph.rs             # Knowledge graph
│   │   │   └── expansion.rs         # Graph expansion algorithms
│   │   ├── memory.rs                # Memory store
│   │   ├── diff.rs                  # Incremental indexing
│   │   ├── learning_ranker/         # Self-evolving ranker
│   │   ├── execution_graph/         # Execution graph
│   │   ├── feature_store/           # Feature store
│   │   ├── consistency/             # Consistency checker
│   │   ├── debugger/                # Autonomous debugger
│   │   └── plugins/                 # Plugin system
│   ├── python/                      # Python Orchestration Layer
│   │   ├── sa_rag/
│   │   │   ├── __init__.py          # RAG main class
│   │   │   ├── client.py            # Client API
│   │   │   ├── rag.py               # RAG pipeline
│   │   │   ├── orchestrator.py      # Orchestrator
│   │   │   ├── query_analysis.py    # Query analysis
│   │   │   ├── llm.py               # LLM service
│   │   │   ├── embedding.py         # Embedding generation
│   │   │   ├── api_server.py        # FastAPI server
│   │   │   └── [next-gen modules]   # Next-generation feature modules
│   │   └── pyproject.toml
│   ├── examples/                    # Example code
│   ├── tests/                       # Test suite
│   └── Cargo.toml
│
├── sa_agentos/                      # Agent System
│   ├── agent_csharp/                # C# Agent Project
│   │   └── SaAgentOS/
│   │       ├── Program.cs           # Main program
│   │       ├── Agents/              # Agent implementation
│   │       ├── Tools/               # Tool collection
│   │       ├── Pipelines/           # Pipeline orchestration
│   │       └── Config/              # Configuration files
│   └── python_server/               # Python HTTP Service
│       ├── server.py                # FastAPI server
│       └── tests/                   # Tests
│
├── SharpAIKit/                      # C# AI Framework (Dependency)
│   ├── src/SharpAIKit/              # Core library
│   │   ├── Agent/                   # Agent framework
│   │   ├── LLM/                     # LLM client
│   │   ├── Memory/                  # Memory management
│   │   └── [other modules]
│   └── samples/                     # Sample projects
│
└── docs/                            # Documentation
    ├── architecture.md              # Architecture documentation
    ├── api_reference.md             # API reference
    └── NEXT_GEN_FEATURES.md         # Next-generation features documentation

🚀 Quick Start

Prerequisites

Rust: 1.70+ (for compiling core engine)
Python: 3.10+ (recommended to use uv package manager)
.NET: 8.0+ (for Agent system)
DeepSeek API Key: for LLM reasoning

1. Install Rust Core Engine

cd SA_RAG
cargo build --release

2. Install Python Package

cd SA_RAG/python

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .

3. Run Tests

cd SA_RAG/tests
uv run pytest test_framework_validation.py -v

4. Start Python API Server

cd SA_RAG/python
uv run python -m sa_rag.api_server

5. Start SA-AgentOS

# 1. Start Python RAG service
cd sa_agentos/python_server
uv run python server.py

# 2. Configure DeepSeek API Key (in appsettings.json)
# 3. Run C# Agent
cd sa_agentos/agent_csharp/SaAgentOS
dotnet run

🔧 Core Components

SA-RAG Core

Rust Core Engine

Semantic Node Engine: Intelligent document segmentation
Hybrid Index: Vector + BM25 hybrid index
Graph Engine: Knowledge graph construction and querying
Multi-Stage Ranker: Multi-stage ranking fusion
Memory Store: Long-term memory management
Diff Index: Incremental index updates

Python Orchestration Layer

Query Analyzer: LLM-driven query analysis
Structured Query Builder: Structured query generation
Orchestrator: Retrieval strategy orchestration
Embedding Generator: Embedding vector generation
RAG Pipeline: End-to-end RAG workflow

SA-AgentOS

C# Agent System

SaRagAgent: Core Agent class (based on SharpAIKit)
KnowledgeRetrievalTool: Knowledge retrieval tool
DeepSeekReasoningTool: DeepSeek reasoning tool
MemoryTool: Memory management tool
SaAgentPipeline: Agent execution pipeline

Python HTTP Service

FastAPI Server: Provides REST API
RAG Query Endpoint: /rag/query
Memory Management: /rag/memory
Health Check: /health

🔧 SharpAIKit: C# AI Framework

SharpAIKit is a unified .NET large-model application and agentic AI development framework that powers the SA-AgentOS system. It provides:

Key Features

✅ Unified LLM Interface: One API for all OpenAI-compatible models (OpenAI, DeepSeek, Qwen, Mistral, etc.)
✅ LCEL-style Chains: Elegant chain composition with pipe operators
✅ Multiple Memory Strategies: Buffer, Window, Summary, Vector, and Entity memory
✅ Advanced Agents: ReAct, Plan-and-Execute, and Multi-Agent systems
✅ Native C# Code Interpreter: Execute C# code directly using Roslyn, no Python needed
✅ SharpGraph: Graph-based orchestration with FSM, supports loops and complex branches
✅ DSPy-style Optimizer: Automatically optimize prompts through iterative improvement

Integration with SA-RAG

SA-AgentOS leverages SharpAIKit's powerful Agent framework:

Tool System: All SA-RAG tools inherit from SharpAIKit.Agent.ToolBase
Agent Framework: SaRagAgent uses SharpAIKit.Agent.AiAgent for tool orchestration
LLM Client: Uses SharpAIKit.LLM.LLMClientFactory for DeepSeek integration
Memory Management: Uses SharpAIKit.Memory.BufferMemory for conversation history

For more information about SharpAIKit, see: SharpAIKit/README.md

🔧 SharpAIKit: C# AI Framework

SharpAIKit is a unified .NET large-model application and agentic AI development framework that powers the SA-AgentOS system. It's the .NET equivalent of LangChain, but with killer features that leverage the .NET ecosystem.

Key Features

✅ Unified LLM Interface: One API for all OpenAI-compatible models (OpenAI, DeepSeek, Qwen, Mistral, Yi, Groq, Moonshot, Ollama, etc.)
✅ LCEL-style Chains: Elegant chain composition with pipe operators
✅ Multiple Memory Strategies: Buffer, Window, Summary, Vector, and Entity memory
✅ Advanced Agents: ReAct, Plan-and-Execute, and Multi-Agent systems
✅ Native C# Code Interpreter: Execute C# code directly using Roslyn, no Python needed 🔮
✅ SharpGraph: Graph-based orchestration with FSM, supports loops and complex branches 🕸️
✅ DSPy-style Optimizer: Automatically optimize prompts through iterative improvement 🧬

Killer Features Beyond LangChain

🔮 Native C# Code Interpreter: Execute C# code directly using Roslyn - no Python dependency, blazing fast!
🕸️ SharpGraph: Graph orchestration with loops and complex branches - more powerful than LangGraph
🧬 DSPy Optimizer: Auto-optimize prompts, gets smarter over time

Integration with SA-RAG

SA-AgentOS leverages SharpAIKit's powerful Agent framework:

Tool System: All SA-RAG tools inherit from SharpAIKit.Agent.ToolBase
- Tools are automatically discovered via [Tool] and [Parameter] attributes
- Seamless integration with Agent's tool calling mechanism
Agent Framework: SaRagAgent uses SharpAIKit.Agent.AiAgent for tool orchestration
- Automatic tool discovery and execution
- Built-in conversation history management
LLM Client: Uses SharpAIKit.LLM.LLMClientFactory for DeepSeek integration
- Unified interface for all OpenAI-compatible APIs
- Streaming support
Memory Management: Uses SharpAIKit.Memory.BufferMemory for conversation history
- Automatic context management
- Configurable history length

Quick Example

using SharpAIKit.LLM;

// Create client for any OpenAI-compatible API
var client = LLMClientFactory.Create(
    apiKey: "your-api-key",
    baseUrl: "https://api.deepseek.com/v1",
    model: "deepseek-chat"
);

// Simple chat
var response = await client.ChatAsync("Hello!");

// Streaming output
await foreach (var chunk in client.ChatStreamAsync("Tell me a story"))
{
    Console.Write(chunk);
}

Supported LLM Providers

Provider	Base URL	Preset Method
OpenAI	`https://api.openai.com/v1`	`CreateOpenAI()`
DeepSeek	`https://api.deepseek.com/v1`	`CreateDeepSeek()`
Qwen (Alibaba)	`https://dashscope.aliyuncs.com/compatible-mode/v1`	`CreateQwen()`
Mistral	`https://api.mistral.ai/v1`	`CreateMistral()`
Yi (01.AI)	`https://api.lingyiwanwu.com/v1`	`CreateYi()`
Groq	`https://api.groq.com/openai/v1`	`CreateGroq()`
Moonshot (Kimi)	`https://api.moonshot.cn/v1`	`CreateMoonshot()`
Ollama (Local)	`http://localhost:11434`	`CreateOllama()`
Any OpenAI-compatible	Custom	`Create(key, url, model)`

For more information about SharpAIKit, see: SharpAIKit/README.md

🤖 SA-AgentOS: Intelligent Agent System

SA-AgentOS is an intelligent agent system built on top of SA-RAG Core and SharpAIKit, providing:

Core Capabilities

✅ Planning: Intelligent decision-making about when to retrieve knowledge
✅ World Model (Memory): Long-term memory management
✅ Knowledge Retrieval: Calls SA-RAG for multi-stage retrieval
✅ Execution Graph: Visualizes Agent reasoning process
✅ Self-Correction: Built-in debugging and reflection capabilities
✅ Multi-Turn Dialogue: Supports context-aware conversations

Usage Example

# Start Agent
cd sa_agentos/agent_csharp/SaAgentOS
dotnet run

# Interactive dialogue
> Hello, I'm John
> What's my name?  # Agent remembers: Your name is John
> Please help me find information about Python
> clear  # Clear conversation history
> exit   # Exit

For detailed documentation, see: sa_agentos/README.md

📊 Performance Comparison

Feature	SA-RAG	LangChain	LlamaIndex	Haystack	Chroma	Weaviate
Performance
Rust Core Engine	✅	❌	❌	❌	✅	✅
Native Speed	✅	❌	❌	❌	✅	✅
Core Features
Semantic Node Parsing	✅	⚠️	✅	⚠️	❌	❌
Multi-stage Retrieval	✅	⚠️	✅	✅	❌	❌
Graph-RAG	✅	⚠️	✅	❌	❌	⚠️
Differential Indexing	✅	❌	❌	❌	❌	❌
Long-term Memory	✅	⚠️	⚠️	⚠️	❌	❌
Hybrid Retrieval	✅	✅	✅	✅	⚠️	⚠️
Next-Gen Features
Self-Evolving Ranker	✅	❌	❌	❌	❌	❌
Execution Graph	✅	❌	❌	❌	❌	❌
Cognitive Memory System	✅	❌	❌	❌	❌	❌
Multimodal Support	✅	⚠️	⚠️	⚠️	❌	⚠️
Feature Store	✅	❌	❌	❌	❌	❌
Autonomous Debugger	✅	❌	❌	❌	❌	❌
Consistency Checker	✅	❌	❌	❌	❌	❌
Plugin System	✅	⚠️	⚠️	⚠️	❌	❌
Architecture
Python + Rust Hybrid	✅	❌	❌	❌	❌	❌
PyO3 Integration	✅	❌	❌	❌	❌	❌
Extensible Plugin API	✅	⚠️	⚠️	⚠️	❌	❌
Performance Metrics
Indexing Speed	~1000 docs/s	~100 docs/s	~200 docs/s	~150 docs/s	~500 docs/s	~800 docs/s
Retrieval Latency	<10ms	~50ms	~30ms	~40ms	~20ms	~15ms
Memory Efficiency	High	Medium	Medium	Medium	High	High

Legend:

✅ Full support
⚠️ Partial support or requires additional setup
❌ Not supported

Key Differentiators

Performance: Rust core engine provides native-speed operations, 5-10x faster than pure Python implementations
Self-Learning: Unique self-evolving ranker learns optimal retrieval strategies from your data
Explainability: Execution graph visualization shows exactly how queries are processed
Cognitive Memory: Three-tier memory system (short-term/long-term/semantic) with automatic consolidation
Autonomous Debugging: Automatically identifies and suggests fixes for retrieval issues
Consistency: Built-in drift detection ensures reproducible and reliable retrieval
Extensibility: Comprehensive plugin system for custom rankers, parsers, and policies
Multimodal-Ready: Native support for images, tables, code blocks, and formulas

🧪 Testing

Run All Tests

cd SA_RAG/tests
uv run pytest test_framework_validation.py -v

Test Coverage

✅ Framework initialization tests
✅ Document indexing tests
✅ Search functionality tests
✅ Q&A functionality tests
✅ Memory management tests
✅ Document update tests
✅ Next-generation feature tests
✅ Error handling tests
✅ Performance tests
✅ Integration tests

📚 API Documentation

Python API

from sa_rag import RAG

# Initialize
rag = RAG(llm_provider="deepseek", embedding_provider="deepseek")

# Index documents
rag.index_documents(["Document 1", "Document 2"])

# Search
results = rag.search("query", top_k=5, use_graph=True, use_memory=True)

# Q&A
answer = rag.ask("question", top_k=5, use_graph=True, use_memory=True)

# Add memory
rag.add_memory("Important information", importance=0.8)

REST API

# Query
curl -X POST http://localhost:8000/rag/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is Python?",
    "use_graph": true,
    "use_memory": true,
    "top_k": 6
  }'

# Add memory
curl -X POST "http://localhost:8000/rag/memory?text=User likes Python&importance=0.8"

For detailed API documentation, see: docs/api_reference.md

🔍 Architecture Documentation

🛠️ Development Guide

Build Rust Core

cd SA_RAG
cargo build --release

Develop Python Package

cd SA_RAG/python
uv sync --dev
uv run pytest

Develop C# Agent

cd sa_agentos/agent_csharp/SaAgentOS
dotnet build
dotnet test

📈 Roadmap

Completed ✅

Planned 🚧

🤝 Contributing

Contributions are welcome! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

🙏 Acknowledgments

SharpAIKit: C# AI framework support
PyO3: Rust-Python bindings
FastAPI: Python web framework
DeepSeek: LLM service provider

SA-RAG - Next-Generation Semantic Retrieval Operating System

⭐ If this project helps you, please give it a Star!

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
SA_RAG		SA_RAG
SharpAIKit		SharpAIKit
docs		docs
sa_agentos		sa_agentos
.DS_Store		.DS_Store
LICENSE		LICENSE
README.md		README.md
README_CN.md		README_CN.md
README_EN.md		README_EN.md

License

dxpython/SA-RAG

Folders and files

Latest commit

History

Repository files navigation

SA-RAG: Semantic-Accelerated Retrieval Augmentation Framework

📖 Table of Contents

🎯 Project Overview

Design Philosophy

✨ Core Features

🔥 Foundation Capabilities

🚀 Next-Generation Capabilities

🏗️ System Architecture

📁 Project Structure

🚀 Quick Start

Prerequisites

1. Install Rust Core Engine

2. Install Python Package

3. Run Tests

4. Start Python API Server

5. Start SA-AgentOS

🔧 Core Components

SA-RAG Core

Rust Core Engine

Python Orchestration Layer

SA-AgentOS

C# Agent System

Python HTTP Service

🔧 SharpAIKit: C# AI Framework

Key Features

Integration with SA-RAG

🔧 SharpAIKit: C# AI Framework

Key Features

Killer Features Beyond LangChain

Integration with SA-RAG

Quick Example

Supported LLM Providers

🤖 SA-AgentOS: Intelligent Agent System

Core Capabilities

Usage Example

📊 Performance Comparison

Key Differentiators

🧪 Testing

Run All Tests

Test Coverage

📚 API Documentation

Python API

REST API

🔍 Architecture Documentation

🛠️ Development Guide

Build Rust Core

Develop Python Package

Develop C# Agent

📈 Roadmap

Completed ✅

Planned 🚧

🤝 Contributing

📄 License

🙏 Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages