Author: John Mitchell (@whmatrix) Status: ACTIVE Audience: Protocol Implementers / Engineers Environment: No code to run (specification only) Fast Path: Read the deliverable contract below
This is the authoritative specification for RAG deliverables, audit contracts, and quality gates.
This repository is the "start here" surface for how I operate under Universal Protocol v4.23:
- Deliverable contract (RAG-ready indexing outputs)
- Audit contract (RAG readiness evaluation)
- Quality gates and verification primitives
- Portfolio artifacts (PDF examples)
- Evidence artifacts are first-class: manifests, validation gates, and audit outputs.
- Explicit pass/fail thresholds for chunk integrity, alignment, and error states.
- Standardized deliverable structure for client handoff.
See: portfolio_artifacts/Semantic_Indexing_Output__Example_Structure.pdf
Deliverables (canonical):
chunks.jsonlmetadata.jsonl(line-aligned with chunks)vectors.index(FAISS)summary.json(manifest + stats + quality gates)
Examples:
- portfolio_artifacts/Portfolio_01_RAG_Readiness_Audit.pdf
- portfolio_artifacts/RAG_Readiness_Audit__Sample_Report.pdf
- Large-scale indexing example: portfolio_artifacts/Portfolio_03_Large_Scale_Semantic_Indexing.pdf
- RAG-ready indexing example: portfolio_artifacts/Portfolio_02_RAG_Ready_Semantic_Indexing.pdf
| Role | Repository | Status |
|---|---|---|
| Production | semantic-indexing-batch-02 | Active (8.3M+ vectors) |
| Foundational | semantic-indexing-batch-01 | Superseded |
This repository is licensed under Apache License 2.0 (not MIT).
Why Apache-2.0 for the protocol spec?
- Patent grant clause: Apache-2.0 includes explicit patent grants, protecting anyone who implements this protocol
- Industry standard: API and protocol specifications commonly use Apache-2.0 (OpenAPI, gRPC, Protobuf)
- Derivative implementations: If you build a pipeline that follows this protocol, the patent grant covers your implementation
Other whmatrix repositories use MIT (simpler, permissive). The choice is intentional:
- MIT for implementation code (tools, scripts, indices)
- Apache-2.0 for this specification (patent protection for implementers)
This protocol defines structure and deliverable contracts, not retrieval quality guarantees. Compliance with this protocol does not imply fitness for any specific application. Retrieval quality, relevance, and domain suitability require independent evaluation.