A pure Rust, built-from-scratch LLM inference engine.
SlowInfer 🐌 is a from-scratch LLM inference engine written purely in Rust.
This project was built during the Spring Festival just to kill time and some CPU cycles.
- From Scratch: Tensor, operators, tokenizers, GGUF parser, and model architectures are all implemented from scratch.
- PyTorch-like Tensor API: Intuitive interface supporting `reshape`, `view`, `permute`, advanced indexing, broadcasting, and more.
- Model Support: Includes a minimal implementation of the Qwen3 architecture.
- OpenAI-style HTTP API: `/v1/chat/completions`, `/v1/completions`, and `/v1/models`.
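To give a feel for what a "PyTorch-like" tensor interface means in Rust, here is a self-contained toy sketch of row-major `reshape` and a 2-D `permute`. This is an illustration only, not SlowInfer's actual API; the `Tensor` struct and method names here are assumptions based on the operations listed above.

```rust
// Illustrative toy tensor; NOT SlowInfer's real implementation.
#[derive(Debug, Clone)]
struct Tensor {
    data: Vec<f32>,
    shape: Vec<usize>,
}

impl Tensor {
    fn new(data: Vec<f32>, shape: Vec<usize>) -> Self {
        assert_eq!(data.len(), shape.iter().product::<usize>());
        Tensor { data, shape }
    }

    /// Reshape is cheap for contiguous row-major data: only metadata changes.
    fn reshape(&self, shape: Vec<usize>) -> Tensor {
        assert_eq!(self.data.len(), shape.iter().product::<usize>());
        Tensor { data: self.data.clone(), shape }
    }

    /// Permute a 2-D tensor's axes by physically transposing (simplest form).
    fn permute2(&self) -> Tensor {
        assert_eq!(self.shape.len(), 2);
        let (r, c) = (self.shape[0], self.shape[1]);
        let mut out = vec![0.0; r * c];
        for i in 0..r {
            for j in 0..c {
                out[j * r + i] = self.data[i * c + j];
            }
        }
        Tensor { data: out, shape: vec![c, r] }
    }
}

fn main() {
    let t = Tensor::new((0..6).map(|x| x as f32).collect(), vec![2, 3]);
    let p = t.permute2();
    println!("{:?}", p.shape); // [3, 2]
    println!("{:?}", p.data);  // [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
}
```

A real engine would track strides so `permute` and `view` avoid copying, but the API surface looks much the same.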
🚧 Work in progress: APIs and model coverage are subject to change.
- Currently capable of running `Qwen3-0.6B-Q8_0.gguf`
- Execution is CPU-only for now
- Many essential features are still under development
- Bring a GGUF file (e.g. `Qwen3-0.6B-Q8_0.gguf`).
- Start the server:

```sh
cargo run --release --bin server -- --gguf Qwen3-0.6B-Q8_0.gguf --host 127.0.0.1 --port 8765
```

- Send a test request:
```sh
curl http://127.0.0.1:8765/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "slowinfer-qwen3",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

- Sit back and relax. It's called SlowInfer for a reason.
- KV-Cache
- High-performance operators
- Memory-mapped weight loading
- Broader quantization support
- More tokenizers, samplers, and model architectures
- Maybe rename the project once we hit these milestones 😈
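Of the roadmap items above, the KV-cache is the most mechanical to picture: during autoregressive decoding, each layer stores the attention keys and values of past tokens and appends to them, instead of recomputing them at every step. A minimal sketch of the data structure (toy types, not SlowInfer code; the field and method names are assumptions):

```rust
// Toy per-layer KV-cache sketch; not SlowInfer's actual implementation.
struct KvCache {
    keys: Vec<Vec<f32>>,   // one key vector per past token
    values: Vec<Vec<f32>>, // one value vector per past token
}

impl KvCache {
    fn new() -> Self {
        KvCache { keys: Vec::new(), values: Vec::new() }
    }

    /// Append the current token's K/V; older entries are reused, not recomputed.
    fn push(&mut self, k: Vec<f32>, v: Vec<f32>) {
        self.keys.push(k);
        self.values.push(v);
    }

    fn len(&self) -> usize {
        self.keys.len()
    }
}

fn main() {
    let mut cache = KvCache::new();
    // Decoding three tokens: each step attends over all cached entries.
    for t in 0..3 {
        cache.push(vec![t as f32; 4], vec![t as f32; 4]);
        // attention at step t would score the new query against cache.keys[0..=t]
    }
    println!("cached tokens: {}", cache.len()); // cached tokens: 3
}
```

This turns per-token attention cost from quadratic recomputation into a single append plus a scan over the cache, which is why it tops the performance roadmap.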
SlowInfer is licensed under MIT. See LICENSE for details.
