An adoption-first index of open-source AI for the enterprise โ curated through the eyes of a CAIO.
้ขๅ CAIO๏ผ้ฆๅธญ AI ๅฎ / AI ่ด่ดฃไบบ๏ผ็ใๅฏๅผๅ ฅๆงใๅผๆบ็ดขๅผใ Not the coolest projects โ the ones you can actually bring into a company.
๐ caiohome.com ยท ๐ท๏ธ Legend ยท ๐งฑ Starter Stack ยท ๐บ๏ธ Decision Chain ยท ๐๏ธ Contents ยท ๐ค Contribute
The one question this list answers: "Can this open-source project be brought into my company โ to which layer, used by whom, under what license, and at how much compliance risk?"
ๆฌๆธ ๅๅชๅ็ญไธไปถไบ๏ผใ่ฟไธชๅผๆบ้กน็ฎ่ฝไธ่ฝๅผๅ ฅๅ ฌๅธใๅผๅ ฅๅฐๅชไธๅฑใ่ฐๆฅ็จใไปไน่ฎธๅฏ่ฏใๅ่ง้ฃ้ฉๅคๅคงใใ Maintained by CAIOไนๅฎถ ยท caiohome.com โ the home for Chief AI Officers.
Every other awesome list organizes by technical category or research topic, serving engineers and researchers. This one flips the axis: it is organized by the CAIO adoption decision chain, and every entry carries a set of enterprise metadata tags and a direct link to its source. Anyone can copy the entries โ nobody can copy the judgment. That judgment is the only reason this list exists.
๐ Why another list? ยท ไธบไปไนๅๅไธไธชๆธ ๅ
Awesome-LLM, Awesome-LLMOps, awesome-mcp-servers and friends are excellent โ but they answer "what exists?". A CAIO needs to answer "what can I safely ship, and what will legal/security say?" The market is full of 50k-star projects that no enterprise can touch (wrong license, no air-gap, single-vendor lock-in), and 800-star projects that are perfect to adopt.
So here the organizing principle is the introduction decision itself. A ๐ด-license 50k-star repo can be worth less to your company than a ๐ข-license 800-star one. Stars are not an inclusion criterion.
ๅธ้ขไธ็ awesome ๆธ ๅๆๆๆฏ็ฑปๅซ็ป็ป๏ผๆๅกๅทฅ็จๅธไธ็ ็ฉถ่ ใๆฌๆธ ๅๆขไธๆ น่ฝด โโ ๆ CAIO ็ๅผๅ ฅๅณ็ญ้พ ๅๅฑ๏ผๅนถ่ฆๆฑๆฏไธชๆก็ฎๆไธๅฅไผไธๅ ๆฐๆฎ + ไธไธชๅฏ็นๅป็ๆบๅคด้พๆฅใๆๆก็ฎๅฎนๆ๏ผๆไธ่ตฐ่ฟๅฅใๅผๅ ฅๅคๆญใใ
Tags are decision aids, not endorsements. Final adoption rests with your legal, compliance, and security review. ๆ ็ญพๆฏๅคๆญ่พ ๅฉ๏ผไธๆฏ่ไนฆใๆ็ปๅผๅ ฅๅณ็ญไปฅๅ ฌๅธๆณๅกใๅ่งใๅฎๅ จ่ฏๅฎกไธบๅใ
License
๐ข๐ก๐ดand Maturityโญ๐งช๐are required on every entry. Other tags are best-effort. Every project name is a link to its first-party source.
| Axis | Values |
|---|---|
| License ยท ่ฎธๅฏ่ฏ | ๐ข Permissive โ Apache-2.0 / MIT / BSD, commercial use out of the box ยท ๐ก Conditional โ MPL / LGPL / model "community" licenses, commercial with strings (read the terms) ยท ๐ด Restricted โ GPL / AGPL / RAIL / non-commercial / custom, legal review mandatory |
| Maturity ยท ๆ็ๅบฆ | โญ Production โ proven at scale, de-facto standard ยท ๐งช Pilot โ engineering-complete, good for a 30โ90 day trial ยท ๐ Watch โ important direction, still moving, track before adopting |
| Deployment ยท ้จ็ฝฒ | ๐ Self-hostable โ on-prem / air-gap, data never leaves ยท โ๏ธ Cloud-first โ private deploy is costly ยท ๐ฑ Edge โ runs on-device / edge box |
| Origin ยท ๆฅๆบ | ๐จ๐ณ China-led ยท ๐ Global / overseas-led |
| Compliance ยท ๅ่ง | ๐ก๏ธ Xinchuang-ready โ adapted to Ascend / domestic silicon ยท |
โ Inclusion criteria ยท ๆถๅฝๆ ๅ
Included โ must satisfy all:
- Open source with a stated OSS license (and we note the license type).
- Enterprise-introducible โ real production or pilot value; not a demo / toy.
- Maps to a layer of the CAIO decision chain (see Contents).
- Active or de-facto standard โ meaningful update in the last ~6 months, or already the standard in its niche.
- Metadata complete โ at minimum license + maturity.
- First-party first โ official org accounts beat personal forks beat second-hand aggregators.
Not included / footnoted only:
- Closed SaaS (unless it has an OSS core; commercial products appear only as "for comparison" notes).
- Pure academic repros with no engineering, unmaintained and without replacement value.
- Repos with unclear or contradictory licensing (until clarified).
- Hype forks / mirrors / marketing repos.
The figure the original notes referred to, drawn for real. Read it topโbottom: each layer is a place where a CAIO makes an introduce / don't-introduce call. The private/Xinchuang layer underpins everything when data residency is a hard constraint.
flowchart TB
MOD["๐ง ยง01 Foundation<br/>open-weight models"]
TRN["๐๏ธ ยง02โ03 Train ยท Tune<br/>fine-tune ยท RL ยท kernels"]
SRV["โ๏ธ ยง04โ05 Serve ยท Schedule<br/>inference engines ยท orchestration"]
GW["๐ช ยง06โ07 Gateway ยท Context<br/>routing ยท keys ยท token efficiency"]
APP["๐ค ยง08โ10 ยท ยง12 Apps ยท Agents<br/>agents ยท RAG ยท MCP ยท auto-research"]
GOV["๐ก๏ธ ยง11 Govern ยท Observe<br/>eval ยท tracing ยท guardrails"]
INF["๐ ยง13 Private ยท Xinchuang ยท Edge<br/>air-gap ยท Ascend ยท on-device"]
HUB["๐ฐ๏ธ ยง14 Source ยท Host<br/>hubs ยท clouds ยท registries"]
MOD --> TRN --> SRV --> GW --> APP --> GOV
INF -. underpins .-> MOD
INF -. underpins .-> SRV
HUB -. supplies .-> MOD
๐งฑ The list's strongest CAIO opinion: an assembled, coherent open-source stack you could actually pilot โ not 150 disconnected links. Swap freely, but this is a sane default for a regulated/enterprise pilot with a multi-source model policy.
| Layer | Default pick (OSS) | Xinchuang / air-gap swap | Why |
|---|---|---|---|
| Models | Qwen + DeepSeek (+ Llama backup) |
same (all self-hostable) | Multi-source: 2 China-led + 1 global backup |
| Inference | vLLM | vllm-ascend |
Throughput standard; Ascend backend exists |
| Orchestration | Ray Serve / llm-d | Ray Serve | Autoscaling, multi-model, K8s-native |
| Gateway | LiteLLM / One-API | One-API (self-host) | Keys ยท quota ยท billing ยท audit in one place |
| Agents / Apps | LangGraph + Dify | Dify (private) | Controllable, auditable orchestration |
| RAG | RAGFlow + Milvus + BGE + MinerU | same (all ๐ ) | Parsing quality is the real RAG bottleneck |
| Govern | Langfuse + promptfoo + NeMo Guardrails | Langfuse (self-host) | Without this layer, nothing is board-defensible |
| Base | MindSpore / CANN | โ | Domestic training/inference stack |
- ๐ท๏ธ Legend ยท ๐บ๏ธ Decision Chain ยท ๐งฑ Starter Stack
- 01 ยท Foundation Models & Open Weights
- 02 ยท Training / Fine-tuning / Post-training (incl. RL)
- 03 ยท High-performance Kernels & Low-level Systems
- 04 ยท Inference Engines
- 05 ยท Compute Scheduling & Serving Orchestration
- 06 ยท Gateway / Routing / API & Cost Governance
- 07 ยท Context Engineering & Token Efficiency
- 08 ยท Orchestration Frameworks & Agents
- 09 ยท MCP / Tools / Skills
- 10 ยท RAG / Knowledge Base / Data Processing
- 11 ยท Evaluation / Observability / Guardrails / Governance
- 12 ยท Autonomous Research & Scientific Discovery
- 13 ยท Private / Xinchuang / Edge Deployment
- 14 ยท Platforms, Hubs & Registries โ where CAIOs source & host
- 15 ยท Orgs & People to Follow (open-source account index)
- 16 ยท Other Awesome Lists (meta-index)
- ๐ค Contributing ยท โ๏ธ Disclaimer
Each section below gives representative anchor entries that demonstrate the tagging format; full coverage is community-driven (see Contributing).
Adoption logic: fix your base first. A "2 China-led + 1 global/open backup" multi-source policy hedges supply risk. Weight licenses frequently differ from the code license โ check each. ๅผๅ ฅ้ป่พ๏ผๅ ๅฎๅบๅบงๆฅๆบ๏ผๆณจๆๆ้่ฎธๅฏ่ฏๅธธไธไปฃ็ ่ฎธๅฏ่ฏไธๅใ
- DeepSeek (
deepseek-ai) ๐จ๐ณ โญ ๐ข โ V/R-series open weights, mostly MIT. Strong reasoning & code. - Qwen (
QwenLM) ๐จ๐ณ โญ ๐ข โ Full size range + multimodal, mostly Apache-2.0; the enterprise-friendly default. - GLM (
zai-org/THUDM) ๐จ๐ณ โญ ๐ก โ GLM series; some versions carry usage terms โ confirm before commercial use. - Kimi (
moonshotai) ๐จ๐ณ โญ ๐ข โ Long-context & agentic models, open weights. - MiniMax (
MiniMax-AI) ๐จ๐ณ โญ ๐ข โ Open-weight large models with long context. - MiniCPM (
OpenBMB) ๐จ๐ณ โญ ๐ก ๐ฑ โ Edge-friendly small models. - Llama (
meta-llama) ๐ โญ ๐ก โ Llama Community License (includes an MAU-threshold clause); not pure OSS. - Mistral / Gemma / Phi (
mistralai/google/microsoft) ๐ โญ ๐ข๐ก โ overseas open-weight reference points. - gpt-oss (
openai) ๐ โญ ๐ข โ OpenAI's first open-weight models (120B / 20B), Apache-2.0; enterprise-safe license, a Western open-weight anchor. - OLMo (
allenai) ๐ ๐งช ๐ข โ Fully open model (training data + code + weights); the reference when full reproducibility / auditability is the requirement.
โ ๏ธ Verify weight licenses one by one: Apache/MIT are commercial-safe; "community licenses" need legal to read the clauses (commercial caps, naming, acceptable-use).
Adoption logic: 99% of enterprises do not pretrain. The action is in fine-tuning (domain adaptation) and post-training / RL (alignment + reasoning gains). ๅผๅ ฅ้ป่พ๏ผ้็นๆฏๅพฎ่ฐ๏ผ้ขๅ้้ ๏ผไธๅ่ฎญ็ป/RL๏ผๅฏน้ฝไธๆจ็ๅขๅผบ๏ผใ
Distributed pretraining / large-scale training
- Megatron-LM (
NVIDIA) ๐ โญ ๐ข โ A de-facto standard for large-scale parallel training. - DeepSpeed (
deepspeedai) ๐ โญ ๐ข โ ZeRO optimization; memory & throughput. - NeMo (
NVIDIA) ๐ โญ ๐ข โ End-to-end training framework; ๐ก๏ธ evaluate on non-Ascend domestic chips. - ColossalAI (
hpcaitech) ๐งช ๐ข โ Parallel-training toolbox. - TorchTitan (
pytorch) ๐งช ๐ข โ PyTorch-native large-model training reference.
Fine-tuning (most common)
- LLaMA-Factory (
hiyouga) ๐จ๐ณ โญ ๐ข ๐ โ One-stop fine-tuning; the most widely deployed in China. - ms-swift (
modelscope) ๐จ๐ณ โญ ๐ข ๐ ๐ก๏ธ โ ModelScope training/tuning suite; good domestic-ecosystem fit. - Unsloth (
unslothai) ๐ โญ ๐ข โ Efficient single-GPU fine-tuning; saves VRAM. - Axolotl (
axolotl-ai-cloud) ๐ โญ ๐ข โ Config-driven fine-tuning. - TRL (
huggingface) ๐ โญ ๐ข โ HF post-training library (SFT / DPO / GRPO).
RLHF / RL (the reasoning-boost hot zone)
- verl (
volcengine/verl-project) ๐จ๐ณ โญ ๐ข โ ByteDance Seed's production-grade RL framework (HybridFlow). - OpenRLHF (
OpenRLHF) ๐ โญ ๐ข โ Early-popular, approachable RLHF library. - ROLL (
alibaba) ๐จ๐ณ ๐งช ๐ข โ Alibaba large-scale RL framework. - AReaL (
inclusionAI/ Ant) ๐จ๐ณ ๐งช ๐ข โ Asynchronous RL, throughput-focused. - slime (
THUDM) ๐จ๐ณ ๐งช ๐ข โ Zhipu/Tsinghua-lineage RL scaling. - NeMo-RL (
NVIDIA-NeMo) ๐ ๐งช ๐ข โ NVIDIA post-training RL. - DAPO (
BytedTsinghua-SIA) ๐จ๐ณ ๐ ๐ข โ ByteDance ร Tsinghua open RL system/algorithm + dataset (built on verl).
Learn from scratch (capability-building for your seed engineers)
- Karpathy (
karpathy) ๐ โญ ๐ข โ nanoGPT / llm.c / nanochat / micrograd; the best "understand LLMs from zero" teaching code.
Adoption logic: in-house teams need these; almost everyone else only uses them, never edits them. Understanding them is what lets you push inference/training cost down. ๅผๅ ฅ้ป่พ๏ผ็ปๅคงๅคๆฐๅ ฌๅธๅชใ็จใไธใๆนใ๏ผไฝ็่งฃๅฎไปฌๅณๅฎไฝ ่ฝไธ่ฝๅไฝๆๆฌใ
- DeepSeek open-infra-index (
deepseek-ai) ๐จ๐ณ โญ ๐ข โ Index of FlashMLA (MLA decode kernel), DeepEP (MoE comm library), DeepGEMM (FP8 GEMM), DualPipe (pipeline parallel), 3FS (parallel filesystem). Production-validated, mostly MIT. - FlashAttention (
Dao-AILab) ๐ โญ ๐ข โ The attention-acceleration standard. - Triton (
triton-lang) ๐ โญ ๐ข โ Language for writing GPU kernels. - CUTLASS (
NVIDIA) ๐ โญ ๐ข โ CUDA matrix-op template library. - Liger-Kernel (
linkedin) ๐ ๐งช ๐ข โ Fused training kernels; saves VRAM.
Adoption logic: this is the main cost battleground. Engine choice directly sets per-GPU throughput and concurrency cost. ๅผๅ ฅ้ป่พ๏ผ้ๆฌไธปๆๅบ๏ผๅผๆ้ๅ็ดๆฅๅณๅฎๅๅกๅๅไธๅนถๅๆๆฌใ
- vLLM (
vllm-project) ๐ โญ ๐ข ๐ โ High-throughput inference standard (PagedAttention); ๐ก๏ธvllm-ascendAscend fork exists. - SGLang (
sgl-project) ๐ โญ ๐ข ๐ โ High-performance serving; common for RAG / structured output. - TensorRT-LLM (
NVIDIA) ๐ โญ ๐ข โ Peak optimization on NVIDIA GPUs. - LMDeploy (
InternLM) ๐จ๐ณ โญ ๐ข ๐ โ InternLM team's deploy stack; domestic-ecosystem friendly. - llama.cpp (
ggml-org) ๐ โญ ๐ข ๐ ๐ฑ โ CPU / edge / quantized deployment. - Xinference (
xorbitsai) ๐จ๐ณ ๐งช ๐ข ๐ โ Multi-model local inference server.
Adoption logic: once you have multi-GPU / multi-node clusters, you need a layer above the engine for load balancing, autoscaling, and multi-tenancy. ๅผๅ ฅ้ป่พ๏ผๅผๆไนไธ้่ฆ่ฐๅบฆไธ็ผๆๅ่ด่ฝฝๅ่กกใๆฉ็ผฉๅฎนใๅค็งๆทใ
- llm-d (
llm-d) ๐ ๐งช ๐ข ๐ โ vLLM/SGLang orchestration on K8s: smart routing, tiered KV-cache, prefill/decode disaggregation, SLO autoscaling. - llmaz (
InftyAI) ๐ ๐งช ๐ข ๐ โ Lightweight inference platform on K8s. - K8s scheduler-plugins (
kubernetes-sigs) ๐ โญ ๐ข ๐ โ GPU scheduling plugins. - Ray / Ray Serve (
ray-project) ๐ โญ ๐ข ๐ โ Distributed serving & orchestration; elastic, multi-model. - KServe (
kserve) ๐ โญ ๐ข ๐ โ The K8s model-serving standard. - NVIDIA Triton + Dynamo (
triton-inference-server/ai-dynamo) ๐ โญ ๐ข โ Official inference serving, enterprise support. - BentoML / OpenLLM (
bentoml) ๐ โญ ๐ข ๐ โ Packaging & deployment. - AIBrix (
vllm-project) ๐ ๐งช ๐ข ๐ โ Cost-efficient, pluggable K8s infra for LLM inference (ByteDance-origin): LLM-aware autoscaling, KV-cache offload, routing. - LMCache (
LMCache) ๐ ๐งช ๐ข ๐ โ KV-cache layer that accelerates serving via cross-request reuse / offload; pairs with vLLM and llm-d.
Adoption logic: the first piece of enterprise AI infrastructure โ unify multi-source access, keys/quota/billing, audit, rate-limit, fallback. ๅผๅ ฅ้ป่พ๏ผไผไธ็บง AIใ็ฌฌไธๅๅบ็ก่ฎพๆฝใ๏ผ็ปไธๆฅๅ ฅไธๆฒป็ใ
LLM gateways / API aggregation
- LiteLLM (
BerriAI) ๐ โญ ๐ข ๐ โ Widest ecosystem, fastest to integrate, self-hostable. - Portkey Gateway (
Portkey-AI) ๐ โญ ๐ข ๐ โ Routing / caching / guardrails / observability / budget control. - Kong AI Gateway (
Kong) ๐ โญ ๐ข โ AI features on a mature API gateway; first choice if you already run Kong. - Helicone (
Helicone) ๐ ๐งช ๐ข ๐ โ Lightweight gateway/observability, easy to integrate. - One-API (
songquanpeng, ๐ข MIT) / New-API (Calcium-Ion, ๐ด AGPL-3.0) ๐จ๐ณ โญ ๐ โ Multi-tenant gateways with keys/quota/billing/audit; a top self-host choice in China.โ ๏ธ New-API is AGPL โ legal review before SaaS redistribution. - Higress (
higress-group/ Alibaba) ๐จ๐ณ โญ ๐ข ๐ โ AI-native API gateway (Envoy-based) with token rate-limiting, semantic cache and content-safety plugins; China-origin, good Xinchuang/medical-content-safety story.
Model routing (pick a model per prompt โ cut cost, raise quality)
- RouteLLM (
lm-sys) ๐ ๐งช ๐ข โ Strong/weak model routing framework. - Semantic Router (
aurelio-labs) ๐ ๐งช ๐ข โ Routing on semantic embeddings.
โ ๏ธ The gateway is where keys and logs converge โ design security / audit / PII-redaction here; it's the cheapest place to do it.
Adoption logic: cuts the API/inference bill directly. As "Vibe Coding" scales across your dev team, token cost rises โ this layer is explicit ROI. ๅผๅ ฅ้ป่พ๏ผ็ดๆฅ็ API/ๆจ็่ดฆๅ๏ผๆพๆง ROIใ
- rtk (Rust Token Killer) (
rtk-ai) ๐ โญ ๐ข ๐ โ CLI proxy that filters/compresses command output before it enters context; saves 60โ90% tokens; transparent hooks into Claude Code / Codex / Cursor / Gemini CLI. - LLMLingua (
microsoft) ๐ ๐งช ๐ข โ Prompt / context compression. - Repomix (
yamadashy) ๐ โญ ๐ข โ Pack a repo into a single file to feed a model. - files-to-prompt / code2prompt (
simonw/mufeedvh) ๐ ๐งช ๐ข โ Code-context assembly.
Adoption logic: the main vehicle for internal enablement (sales / support / docs / R&D). In regulated/medical settings, prefer controllable, auditable, deterministic frameworks. ๅผๅ ฅ้ป่พ๏ผๅ ้จ่ต่ฝ่ฝๅฐ็ไธป่ฝฝไฝ๏ผๅ่งๅบๆฏไผๅ ๅฏๆงใๅฏๅฎก่ฎกใ็กฎๅฎๆงๅผบ็ๆกๆถใ
- LangGraph (
langchain-ai) ๐ โญ ๐ข ๐ โ Graph-based agent orchestration; production-ready, highly controllable. - LlamaIndex (
run-llama) ๐ โญ ๐ข ๐ โ RAG / agent data framework. - AutoGen (
microsoft) ๐ โญ ๐ข โ Multi-agent orchestration. - DSPy (
stanfordnlp) ๐ ๐งช ๐ข โ Declarative prompt / program optimization. - DeerFlow (
bytedance) ๐จ๐ณ ๐งช ๐ข ๐ โ Deep-research multi-agent; production-base candidate. - Dify (
langgenius) ๐จ๐ณ โญ ๐ข ๐ โ LLM app platform; low-code + self-hostable. - MetaGPT (
FoundationAgents) ๐จ๐ณ โญ ๐ข โ Multi-agent "software team" paradigm. - OpenHands (
All-Hands-AI) ๐ โญ ๐ข ๐ โ Open-source coding agent. - n8n (
n8n-io) ๐ โญ ๐ก ๐ โ Workflow automation (Sustainable Use License โ confirm terms). - CrewAI (
crewAIInc) ๐ โญ ๐ข โ Role-playing multi-agent orchestration; popular for collaborative agent teams. - Pydantic AI (
pydantic) ๐ โญ ๐ข โ Type-safe agent framework; strong fit for controllable, testable enterprise agents. - Google ADK (
google) ๐ โญ ๐ข โ Code-first Agent Development Kit: build, evaluate, deploy. - Agno (
agno-agi) ๐ โญ ๐ข โ High-performance framework to build and run agent platforms. - Bisheng (
dataelement) ๐จ๐ณ โญ ๐ข ๐ โ Enterprise LLM DevOps platform: workflow + RAG + agents + fine-tune + observability, self-hostable.
โ ๏ธ Medical Class II/III: fully dynamic multi-agent systems conflict with NMPA "deterministic, auditable" requirements โ freeze a traceable chain at the orchestration layer.
Adoption logic: the protocol + capability layer that lets agents safely touch enterprise systems. The enterprise focus is gatewayed MCP + audit + permissions. ๅผๅ ฅ้ป่พ๏ผ่ฎฉ Agent ๅฎๅ จๆฅๅ ฅไผไธ็ณป็ป็ๅ่ฎฎๅฑไธ่ฝๅๅฑใ
- Model Context Protocol (
modelcontextprotocol) ๐ โญ ๐ข โ Official protocol + SDKs. - awesome-mcp-servers (
punkpeye) ๐ โ The master index of MCP servers. - awesome-mcp-enterprise (
bh-rat) ๐ โ Enterprise-grade MCP subset. - MCP Gateway (
lasso-security) + mcpo ๐ ๐งช ๐ข ๐ โ MCP gateway / auth / audit. - Composio (
ComposioHQ) ๐ ๐งช ๐ข โ Tool integration. - FastMCP (
PrefectHQ) ๐ โญ ๐ข โ The fast, Pythonic way to build MCP servers and clients. - awesome-claude-skills (
ComposioHQ) ๐ โ Skills asset index.
Adoption logic: the base for internal-knowledge enablement and product RAG. Document-parsing quality is usually the real make-or-break of RAG. ๅผๅ ฅ้ป่พ๏ผๆๆกฃ่งฃๆ่ดจ้ๅพๅพๆฏ RAG ๆ่ดฅ็็ๆญฃ็ถ้ขใ
Vector DB / retrieval
- Milvus (
milvus-io/ Zilliz) ๐จ๐ณ โญ ๐ข ๐ โ Mainstream vector database. - Qdrant (
qdrant) ๐ โญ ๐ข ๐ โ Rust vector DB. - pgvector (
pgvector) ๐ โญ ๐ข ๐ โ Postgres extension; lowest ops overhead. - TurboVec (
RyanCodrai) ๐ ๐งช ๐ข ๐ โ Embedded vector index (FAISS-class, not a server DB) on Google Research's TurboQuant quantizer (ICLR 2026);16ร compression vs float32 (10Mโ4GB), recall on par with FAISS-PQ, modest speed edge (1โ20%). MIT. New & ~single-maintainer โ fit for local/private-RAG PoC; vet maturity before production.
๐งช = emerging/assess. Note: an embedded index (like FAISS/ScaNN), not a server-side vector DB (Milvus/Qdrant/pgvector) โ no distributed/clustering, filtering is id-allowlist only.
Embedding / rerank
- BGE (
FlagOpen/ BAAI) ๐จ๐ณ โญ ๐ข ๐ โ BGE-M3 / BGE-Reranker; first choice for Chinese-language RAG retrieval.
RAG frameworks / app platforms
- RAGFlow (
infiniflow) ๐จ๐ณ โญ ๐ข ๐ โ RAG engine with deep document understanding. - LightRAG (
HKUDS) ๐ ๐งช ๐ข โ Graph-augmented RAG. - GraphRAG (
microsoft) ๐ ๐งช ๐ข โ Knowledge-graph RAG. - FastGPT (
labring) ๐จ๐ณ โญ ๐ข ๐ โ Knowledge-base Q&A platform.
Document parsing (the real bottleneck)
- MinerU (
opendatalab) ๐จ๐ณ โญ ๐ก ๐ โ PDF / layout parsing; the domestic first choice.โ ๏ธ Apache-2.0 + additional terms โ confirm for large-scale commercial use. - Docling (
docling-project/ IBM) ๐ โญ ๐ข ๐ โ Documents โ structured data. - Unstructured (
Unstructured-IO) ๐ โญ ๐ก โ Multi-format parsing. - markitdown (
microsoft) ๐ โญ ๐ข โ Convert Office / PDF / HTML and more into clean Markdown for LLMs.
Web ingestion & memory
- Crawl4AI (
unclecode) ๐ โญ ๐ข โ LLM-friendly web crawler / scraper for RAG data ingestion. - mem0 (
mem0ai) ๐ โญ ๐ข โ Memory layer for agents; persistent user / agent memory across sessions.
Adoption logic: the CAIO's shield. Without this layer, everything above is unauditable and indefensible to the board. ๅผๅ ฅ้ป่พ๏ผCAIO ็ใ็พใใๆฒกๆ่ฟไธๅฑ๏ผๅ้ขๆๆๅผๅ ฅ้ฝไธๅฏๅฎก่ฎกใไธๅฏๅ่ฃไบไผไบคไปฃใ
Evaluation
- lm-evaluation-harness (
EleutherAI) ๐ โญ ๐ข โ The academic eval standard. - OpenCompass (
open-compass) ๐จ๐ณ โญ ๐ข โ Comprehensive China-origin eval. - promptfoo (
promptfoo) ๐ โญ ๐ข ๐ โ Engineering-grade prompt/model eval & red-teaming. - Ragas (
vibrantlabsai) ๐ โญ ๐ข โ RAG evaluation toolkit (faithfulness, answer relevancy, context metrics). - DeepEval (
confident-ai) ๐ โญ ๐ข โ LLM evaluation / unit-testing framework with many built-in metrics.
Observability / tracing
- Langfuse (
langfuse) ๐ โญ ๐ข ๐ โ Open-source LLM observability, self-hostable. - Phoenix (
Arize-ai) ๐ โญ ๐ข ๐ โ Tracing & evaluation. - OpenLLMetry (
traceloop) ๐ ๐งช ๐ข โ OpenTelemetry semantic conventions for LLMs.
Guardrails / safety
- NeMo Guardrails (
NVIDIA) ๐ โญ ๐ข ๐ โ Conversational guardrails. - Guardrails AI (
guardrails-ai) ๐ ๐งช ๐ข โ Output validation. - Llama Guard / PurpleLlama (
meta-llama) ๐ โญ ๐ก โ Safety classification. - garak (
NVIDIA) ๐ ๐งช ๐ข โ LLM vulnerability / red-team scanner. - Presidio (
microsoft) ๐ โญ ๐ข ๐โ ๏ธ โ PII / PHI detection, redaction and anonymization; the gateway-side control for medical / financial data.
Adoption logic: the "research accelerator" for R&D / medical affairs. Mostly ๐ Watch for now โ mind the non-standard licenses and reproducibility. ๅผๅ ฅ้ป่พ๏ผ็ ๅ/ๅปๅญฆไบๅก็ใ็ ็ฉถๅ ้ๅจใ๏ผๅฝๅๅคไธบ่งๅฏๆใ
- AI Scientist / v2 (
SakanaAI) ๐ ๐ ๐ด โ Pioneering end-to-end autonomous research.โ ๏ธ Non-standard license (RAIL-derived, with a mandatory-disclosure clause) โ legal review mandatory before any enterprise use. - AI-Researcher (
HKUDS) ๐ ๐ ๐ข โ HKU Data Intelligence Lab; autonomous research innovation. - open-ai-co-scientist (
llnl) ๐ ๐ ๐ข โ Open reproduction of Google's AI co-scientist multi-agent system. - GPT-Researcher (
assafelovic) ๐ ๐งช ๐ข ๐ โ Autonomous deep-research agent. - DeerFlow (see ยง08) ๐จ๐ณ โ Deep-research orchestration, self-hostable.
Adoption logic: the hard constraints of medical / government / SOE โ data never leaves, domestic substitution, edge/on-device. This layer decides whether everything above can legally land. ๅผๅ ฅ้ป่พ๏ผๆฐๆฎไธๅบๅใๅฝไบงๅๆฟไปฃใ่พน็ผ็ซฏไพง โโ ๅณๅฎๅ้ขๆๆ้กน็ฎ่ฝไธ่ฝๅ่ง่ฝๅฐใ
- awesome-private-ai (
tdi) ๐ โ On-prem / air-gap / self-hosted curated list. - MindSpore / CANN (
mindspore-ai/ Huawei Ascend) ๐จ๐ณ โญ ๐ข ๐ ๐ก๏ธ โ Ascend training/inference stack. - vllm-ascend (
vllm-project) ๐จ๐ณ ๐งช ๐ข ๐ ๐ก๏ธ โ vLLM Ascend backend; the Xinchuang inference path. - LocalAI (
mudler) ๐ โญ ๐ข ๐ ๐ฑ โ OpenAI-compatible local inference. - Jan (
menloresearch) ๐ โญ ๐ข ๐ ๐ฑ โ Offline AI assistant. - Ollama (
ollama) ๐ โญ ๐ข ๐ ๐ฑ โ Local model runtime (MIT; mind trademark & commercial positioning, not the license).
Adoption logic: the repos are only half the story. A CAIO also needs the sourcing & hosting map โ where models actually live, where you pull them behind the firewall, and which managed clouds and domestic silicon you can stand on. ๅผๅ ฅ้ป่พ๏ผ็ฅ้ไปๅบ่ฟไธๅค๏ผ่ฟ่ฆ็ฅ้ใๅปๅชๅๆจกๅใๅจๅชๆ็ฎกใ้ ๅชไธชๅฝไบงๆ ใใ
โ ๏ธ Managed clouds below are listed as sourcing/hosting venues, not as OSS entries โ included because that is where CAIOs source & host in practice.
Model hubs & registries ยท ๆจกๅๆข็บฝ
- Hugging Face ๐ โ The default global hub for models, datasets & Spaces.
- ModelScope ้ญๆญ ๐จ๐ณ โ Alibaba-backed hub; the de-facto China mirror for weights & datasets.
- GitCode AI / ๆจกๅๅนฟๅบ ๐จ๐ณ โ CSDN-backed mirror of OSS & models.
- Kaggle Models ๐ โ Hosted models + notebooks.
- Ollama Library ๐ ๐ฑ โ One-command local model pulls.
Code hosting ยท ไปฃ็ ๆ็ฎก
- GitHub ๐ โ Where most of this list lives.
- Gitee ็ ไบ ๐จ๐ณ โ China's largest host; many domestic OSS mirrors.
- GitLab ๐ โ Self-hostable CE, common behind the firewall.
- AtomGit ๐จ๐ณ ๐ก๏ธ โ OpenAtom Foundation hosting, Xinchuang-aligned.
Managed AI clouds โ overseas ยท ๆตทๅคไบ
- AWS Bedrock / SageMaker ๐ โ๏ธ
- Azure AI Foundry ๐ โ๏ธ
- Google Vertex AI ๐ โ๏ธ
- NVIDIA NGC / NIM ๐ โ๏ธ ๐ โ Containers, models, microservices.
Managed AI clouds โ China ยท ๅฝๅ ไบ
- Alibaba Cloud Model Studio ็พ็ผ ๐จ๐ณ โ๏ธ
- Volcengine Ark ็ซๅฑฑๆน่ ๐จ๐ณ โ๏ธ
- Baidu Qianfan ๅๅธ ๐จ๐ณ โ๏ธ
- Tencent Cloud TI / Hunyuan ่ พ่ฎฏไบ ๐จ๐ณ โ๏ธ
- Huawei Cloud ModelArts ๅไธบไบ ๐จ๐ณ โ๏ธ ๐ก๏ธ
Inference-as-a-service / API aggregators ยท ๆจ็ๅณๆๅก
- OpenRouter ๐ โ Many models behind one key.
- Together / Fireworks / Groq / DeepInfra ๐ โ Hosted open-weight inference at speed.
- SiliconFlow ็ก ๅบๆตๅจ ๐จ๐ณ โ Low-cost hosted open-weight inference.
Domestic compute stacks ยท ๅฝไบง็ฎๅๆ (ไฟกๅ)
- Huawei Ascend CANN ๐จ๐ณ ๐ก๏ธ โ The Ascend NPU compute architecture (the CUDA analog for Xinchuang).
- MindSpore ๐จ๐ณ ๐ก๏ธ โ Huawei's AI framework.
- Cambricon ๅฏๆญฆ็บช ยท Moore Threads ๆฉๅฐ็บฟ็จ ยท Hygon ๆตทๅ ยท Biren ๅฃไป ๐จ๐ณ ๐ก๏ธ โ Domestic accelerators to evaluate for Xinchuang procurement.
Adoption logic: follow the source, not the repo. Watching these official org accounts gets you the signal earlier than chasing single repos. Prioritize org accounts; keep personal accounts minimal. ๅผๅ ฅ้ป่พ๏ผ่ทไปๅบไธๅฆ่ทใๆบๅคดใใ
Models & low-level โ China: deepseek-ai ยท QwenLM ยท zai-org (GLM) ยท OpenBMB ยท ByteDanceSeed ยท MoonshotAI (Kimi) ยท MiniMax-AI ยท FlagOpen (BAAI) ยท InternLM ยท modelscope
Models & low-level โ global: NVIDIA ยท microsoft ยท meta-llama ยท mistralai ยท huggingface ยท google-research
Inference / infrastructure: vllm-project ยท sgl-project ยท ray-project ยท InftyAI ยท llm-d ยท BerriAI (LiteLLM)
Agents / apps / RAG: langchain-ai ยท run-llama ยท langgenius (Dify) ยท infiniflow (RAGFlow) ยท HKUDS ยท opendatalab
Protocol / tools: modelcontextprotocol ยท ComposioHQ
People (OSS maintainers, follow as needed): karpathy (understand LLMs from zero) โ otherwise track via the org accounts above.
This list doesn't reinvent the wheel. Below are deeper, domain-specific lists โ use them as drill-down entry points. ๆฌๆธ ๅไธ้ๅค้ ่ฝฎๅญ๏ผไปฅไธๆฏๅ็ปๅ้ขๅๆดๆทฑ็ไธ้จๆธ ๅใ
- tensorchord/Awesome-LLMOps โ Full-stack LLMOps; best-structured.
- Not-Diamond/awesome-ai-model-routing โ Model-routing focus.
- xlite-dev/Awesome-LLM-Inference โ Inference acceleration: papers + code.
- deepseek-ai/open-infra-index โ DeepSeek's official infra index.
- Hannibal046/Awesome-LLM โ The classic master LLM list.
- EthicalML/awesome-production-machine-learning โ Production ML / governance, the elder list.
- punkpeye/awesome-mcp-servers ยท bh-rat/awesome-mcp-enterprise โ MCP ecosystem.
- Shubhamsaboo/awesome-llm-apps ยท e2b-dev/awesome-ai-agents โ Apps & agents.
PRs welcome โ see CONTRIBUTING.md. The short version:
- One project per PR, in the right section, sorted by maturity then name.
- License + maturity tags are mandatory; add deployment/origin/compliance tags where you can.
- Every entry must link to its first-party source.
- Maintainer review before merge: link validity, license matches the code's own declaration, activity in the last ~6 months.
- One sentence on which layer it fits and what problem it solves โ no marketing fluff.
- Closed/commercial products don't go in the body; if there's an OSS core, link the core repo.
๐ค A GitHub Action checks every link on each PR, and the
scripts/audit.pyhelper cross-checks stars + license against the GitHub API.
This list is a decision aid, not legal, compliance, or procurement advice. Final license, compliance, and security judgments rest with your company's legal, compliance, and security teams. Tags can go stale as projects evolve โ always confirm against the project's current upstream state before adoption.
ๆฌๆธ ๅไธบๅผๅ ฅๅณ็ญ็่พ ๅฉๅ่๏ผไธๆๆๆณๅพใๅ่งๆ้่ดญๅปบ่ฎฎใๅผๅ ฅๅ่ฏทไปฅ้กน็ฎๅฎๆนไปๅบ็ๅฝๅ็ถๆไธบๅใ
Maintained with โ by CAIOไนๅฎถ ยท caiohome.com โ the home for Chief AI Officers. Content licensed under CC BY 4.0. Attribution: "Awesome CAIO โ caiohome.com".
If this saved you one bad procurement decision, give it a โญ and pass it to your AI lead.