RAGfish is the reference implementation of the Noema Architecture — an AI runtime design that strictly separates decision, execution, and knowledge.
The project implements the architectural principle:
AI is not the subject. AI is a tool executed under human‑controlled policy.
This repository contains the architectural documentation and implementation artifacts for the Noema runtime model.
The system is divided into three core layers:
| Layer | Responsibility |
|---|---|
| Client Layer | Decision making and routing |
| Execution Layer | Constrained task execution |
| Knowledge Layer | Storage and retrieval of knowledge |
These layers are separated by explicit boundaries so that no single component can both decide and execute.
The overall architecture is shown in docs/diagrams/architecture-standalone.png.
Key principles:
- Decision authority exists only in the client runtime
- Execution agents are stateless and constrained
- Knowledge stores contain no behavior
- Routing decisions are explicit and auditable
The PolicyEngine determines where a query should be executed.
Routing decisions are based on runtime policy signals such as:
- Tool or external execution requirements
- Privacy sensitivity
- Latency preferences
Example outcomes:
- localLLM → execute locally using llama.cpp
- remoteAgent → execute via noema-agent
The router itself performs no decision logic. It only maps policy results to execution routes.
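The split between deciding and mapping can be sketched as follows. This is a minimal illustration in Python, not the Swift client code: the `PolicySignals` field names, the `PolicyResult` labels, and the rule order are all assumptions made for the example. Only the two route names come from this document.

```python
from dataclasses import dataclass
from enum import Enum


class PolicyResult(Enum):
    """Hypothetical outcome labels produced by the policy step."""
    PREFER_LOCAL = "prefer_local"
    REQUIRE_REMOTE = "require_remote"


class Route(Enum):
    """The two routes named in this document."""
    LOCAL_LLM = "localLLM"
    REMOTE_AGENT = "remoteAgent"


@dataclass(frozen=True)
class PolicySignals:
    """Assumed signal shape; field names are illustrative."""
    needs_tools: bool
    privacy_sensitive: bool
    prefers_low_latency: bool


def evaluate_policy(signals: PolicySignals) -> PolicyResult:
    """PolicyEngine role: every conditional lives here."""
    # Assumed rule order: privacy first, then tool needs, then latency.
    if signals.privacy_sensitive:
        return PolicyResult.PREFER_LOCAL
    if signals.needs_tools:
        return PolicyResult.REQUIRE_REMOTE
    if signals.prefers_low_latency:
        return PolicyResult.PREFER_LOCAL
    # Assumed default when no signal is set.
    return PolicyResult.REQUIRE_REMOTE


# Router role: a plain lookup table with no conditionals of its own.
ROUTE_TABLE = {
    PolicyResult.PREFER_LOCAL: Route.LOCAL_LLM,
    PolicyResult.REQUIRE_REMOTE: Route.REMOTE_AGENT,
}


def route(result: PolicyResult) -> Route:
    return ROUTE_TABLE[result]
```

Because `route` is only a dictionary lookup, changing how queries are routed means changing `evaluate_policy`, which keeps the decision logic in one reviewable place.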
The sequence diagram in docs/assets/execution-runtime-sequence.png illustrates how a request moves through the system.
Execution flow:
- User submits query
- Client UI forwards request to ExecutionCoordinator
- PolicyEngine evaluates execution conditions
- Router selects execution route
- Request executes locally or via remote agent
- Execution layer retrieves references from RAGpack store
- Response is returned to the client
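The steps above can be sketched as one coordinator function. This is an illustrative Python sketch, not the Swift `ExecutionCoordinator`: the helper names (`make_coordinator`, `retrieve`), the policy labels, the keyword-match retrieval, and the in-memory store are all assumptions for the example.

```python
from typing import Callable, Dict, List


def make_coordinator(
    evaluate: Callable[[str], str],
    route_for: Dict[str, str],
    executors: Dict[str, Callable[[str], str]],
) -> Callable[[str], str]:
    """Build a handler that walks the execution flow in order."""

    def handle(query: str) -> str:
        policy_result = evaluate(query)      # PolicyEngine evaluates conditions
        route = route_for[policy_result]     # Router maps the result to a route
        return executors[route](query)       # request runs locally or remotely

    return handle


# Stand-in RAGpack store: chunk id -> text (hypothetical content).
RAGPACK = {"c1": "Noema separates decision, execution, and knowledge."}


def retrieve(query: str) -> List[str]:
    """Return ids of chunks sharing at least one word with the query."""
    words = set(query.lower().split())
    hits = []
    for chunk_id, text in RAGPACK.items():
        chunk_words = set(text.lower().replace(",", "").replace(".", "").split())
        if words & chunk_words:
            hits.append(chunk_id)
    return hits


def local_executor(query: str) -> str:
    refs = retrieve(query)                   # execution layer fetches references
    return f"local answer using {refs}"


def remote_executor(query: str) -> str:
    refs = retrieve(query)
    return f"remote answer using {refs}"


handle = make_coordinator(
    evaluate=lambda q: "needs_tools" if "search" in q.lower() else "local_ok",
    route_for={"local_ok": "localLLM", "needs_tools": "remoteAgent"},
    executors={"localLLM": local_executor, "remoteAgent": remote_executor},
)
```

Calling `handle("What does Noema separate?")` takes the local route, while a query containing "search" takes the remote one. In both cases the executor never sees or alters the policy result.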
    RAGfish/
    ├ docs/
    │  ├ diagrams/
    │  │    architecture-standalone.puml
    │  │    architecture-standalone.png
    │  │    routing-decision.puml
    │  │    execution-sequence.puml
    │  │
    │  └ assets/
    │       routing-policy-decision.png
    │       execution-runtime-sequence.png
    │
    ├ NoesisNoema/
    │    Swift client runtime
    │
    ├ noema-agent/
    │    Constrained execution runtime
    │
    └ noesisnoema-pipeline/
         RAGpack generation tools
Client runtime components:
- MinimalClientView
- ExecutionCoordinator
- PolicyEngine
- Router
- LocalExecutor
- AgentExecutor
- AgentClient
Responsibilities:
- Evaluate policy
- Route execution
- Manage human interaction
The client runtime is the only decision authority.
The invocation boundary prevents execution layers from influencing decision logic.
AgentClient → noema-agent API
This boundary ensures that the execution environment cannot modify routing decisions.
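One way to picture this boundary is with immutable message types. The Python sketch below is an assumption-laden illustration, not the real noema-agent API: `AgentRequest`, `AgentResponse`, `fake_agent`, and their fields are hypothetical. The point is that the route is fixed before the call and the response type has no field that could carry a routing instruction.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class AgentRequest:
    """What crosses the boundary outward. Field names are assumptions."""
    task: str
    route: str  # decided by the client before the call; read-only afterwards


@dataclass(frozen=True)
class AgentResponse:
    """What comes back: output only, no routing-related fields."""
    output: str


def fake_agent(request: AgentRequest) -> AgentResponse:
    """Stand-in for the noema-agent API (hypothetical; no network call)."""
    return AgentResponse(output=f"done: {request.task}")


def call_agent(request: AgentRequest) -> str:
    """AgentClient role: send the request and hand back only the output text."""
    response = fake_agent(request)
    return response.output


def try_to_reroute(request: AgentRequest) -> bool:
    """Return True if the request could be mutated (it should not be)."""
    try:
        request.route = "localLLM"  # frozen dataclasses reject assignment
    except FrozenInstanceError:
        return False
    return True
```

A frozen dataclass only guards against in-process mutation. Across a real network boundary the same idea would have to be enforced by the response schema the client is willing to parse.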
Execution layer components:
- ConstraintEngine
- TaskExecutor
Responsibilities:
- Validate requests
- Execute tasks
- Remain stateless
The execution layer has no decision authority.
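The validate-then-execute pattern can be sketched as below. This is a hedged Python illustration rather than the noema-agent code: the allow-list, the size limit, the task names, and the toy task bodies are assumptions, since this document does not specify the real constraint set.

```python
from dataclasses import dataclass

# Assumed constraints; the real rules are not specified in this document.
ALLOWED_TASKS = frozenset({"summarize", "extract"})
MAX_INPUT_CHARS = 2000


@dataclass(frozen=True)
class TaskRequest:
    task: str
    text: str


def validate(request: TaskRequest) -> None:
    """ConstraintEngine role: reject anything outside the allow-list or size limit."""
    if request.task not in ALLOWED_TASKS:
        raise ValueError(f"task not permitted: {request.task}")
    if len(request.text) > MAX_INPUT_CHARS:
        raise ValueError("input too large")


def execute(request: TaskRequest) -> str:
    """TaskExecutor role: a pure function of the request, with no stored state."""
    validate(request)
    if request.task == "summarize":
        # Toy summary: keep the first sentence only.
        return request.text.split(".")[0].strip() + "."
    # "extract": return capitalized words as a stand-in for real extraction.
    return " ".join(w for w in request.text.split() if w[:1].isupper())
```

Since `execute` reads nothing but its argument and writes nothing outside its return value, two identical requests always yield the same result, and a request for an unlisted task fails before any work is done.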
RAGpack Store
Responsibilities:
- Store chunked knowledge
- Provide retrieval references
The knowledge layer has no behavior.
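A passive store can be modeled as read-only data plus a lookup function that lives outside it. The Python sketch below is illustrative only: the `Chunk` shape, the substring match, and the function names are assumptions, and the actual RAGpack format is not described in this document.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import List, Mapping


@dataclass(frozen=True)
class Chunk:
    """Assumed chunk shape: an id and its text."""
    chunk_id: str
    text: str


def load_store(chunks: List[Chunk]) -> Mapping[str, Chunk]:
    """Return a read-only mapping of chunk id to chunk."""
    return MappingProxyType({c.chunk_id: c for c in chunks})


def retrieve_refs(store: Mapping[str, Chunk], term: str) -> List[str]:
    """Return sorted ids of chunks whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return sorted(cid for cid, c in store.items() if needle in c.text.lower())
```

The store holds data and nothing else: retrieval returns references (chunk ids) rather than generated text, and attempts to assign into the mapping raise `TypeError`.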
Two execution paths currently exist:
| Route | Description |
|---|---|
| localLLM | Local inference via llama.cpp |
| remoteAgent | Execution through noema-agent |
Routing decisions are made by the PolicyEngine and mapped by the Router.
The Noema architecture follows several strict rules:
- AI cannot decide. Only policy code may decide.
- Execution must be constrained. Agents cannot autonomously act.
- Knowledge must be passive. Knowledge systems do not contain logic.
- Routing must be explicit. All execution paths are observable.
These rules prevent uncontrolled agent behavior and keep routing deterministic: the same policy signals always produce the same route.
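Making routing observable usually means writing down each decision as structured data. The Python sketch below shows one possible shape for such a record. `RoutingRecord`, `AuditLog`, and the field names are hypothetical and not taken from the repository.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class RoutingRecord:
    """One routing decision; field names are assumptions."""
    query_id: str
    policy_result: str
    route: str
    reason: str


class AuditLog:
    """Append-only list of routing records, serialized as JSON lines."""

    def __init__(self) -> None:
        self._records: List[RoutingRecord] = []

    def record(self, entry: RoutingRecord) -> None:
        self._records.append(entry)

    def dump(self) -> str:
        # One JSON object per line, with sorted keys for stable diffs.
        return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in self._records)
```

Each line of `dump()` answers which route a query took and why, which is the information a reviewer needs to check a decision against the policy code.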
Typical development flow:
    git checkout main
    git checkout -b feature/epicX
    # implement
    git add -A
    git commit -m "Describe the change"
    git push -u origin feature/epicX
    # open PR
The project is currently implementing:
EPIC4 — Routing & Hybrid Execution
This phase introduces:
- Policy-based routing
- Local vs remote execution
- Hybrid runtime orchestration
Planned improvements include:
- Advanced routing policies
- Cost-aware routing
- Privacy-aware execution
- Model selection policies
TBD


