BinaryAI Engine

CPU-optimized LLM inference runtime
11-14% faster than llama.cpp on the 8B models benchmarked below. Written from scratch in C.

Benchmarks

All tests were run on a 12-core/24-thread x86-64 CPU with dual-channel DDR4-3200, using models in GGUF Q4_K_M format.

Model	Architecture	BinaryAI	llama.cpp	Advantage
Qwen3-8B	Qwen3, 36 layers	12.2 tok/s	10.7 tok/s	+14%
Sherkala-8B	Llama 3.1, 32 layers	12.6 tok/s	11.35 tok/s	+11%
KazLLM-8B	Llama 3.1, 32 layers	11.9 tok/s	10.73 tok/s	+11%
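
The Advantage column can be reproduced from the two throughput columns as (BinaryAI / llama.cpp - 1) × 100, rounded to the nearest whole percent. A minimal sketch of that arithmetic follows; the helper name `advantage_pct` is hypothetical and is not part of BinaryAI.

```c
/* Relative throughput advantage of `ours` over `baseline`, in whole percent.
   Hypothetical helper for illustration only; not part of BinaryAI. */
static int advantage_pct(double ours, double baseline)
{
    double pct = (ours / baseline - 1.0) * 100.0;
    /* Round half away from zero without needing libm. */
    return pct >= 0.0 ? (int)(pct + 0.5) : (int)(pct - 0.5);
}
```

For example, `advantage_pct(12.2, 10.7)` corresponds to the Qwen3-8B row of the table.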

Key Innovations

4 ILP accumulators — Saturate AVX2 pipeline for Q4_K dot product
Fused Residual+RMSNorm — 50% fewer memory passes per layer
JND Perceptual Pruning — 20-40% FFN sparsity at zero quality loss (from EntropyX codec research)
99.7% Delta Sparsity — Path to 100× LM head acceleration
557 KB static binary — No Python, no CUDA, no dependencies
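
The first two items can be illustrated in plain C. This is a simplified sketch and not BinaryAI's actual kernels: it works on ordinary `float` arrays instead of Q4_K blocks and uses scalar variables where a real AVX2 kernel would presumably keep each accumulator in a vector register. The function names `dot4` and `add_residual_sumsq` are hypothetical.

```c
#include <stddef.h>

/* Dot product with four independent accumulators.
   With a single accumulator every add waits on the previous one;
   four separate sums form four independent dependency chains that
   the CPU can overlap (instruction-level parallelism). */
static float dot4(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i + 0] * b[i + 0];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    for (; i < n; i++)          /* tail elements when n is not a multiple of 4 */
        s0 += a[i] * b[i];

    return (s0 + s1) + (s2 + s3);
}

/* Fused residual add + sum of squares.
   Updates x[i] += r[i] in place and, in the same pass over memory,
   accumulates the sum of squares that RMSNorm needs. An unfused version
   would re-read x in a separate reduction pass. The caller would then
   scale by 1 / sqrt(sumsq / n + eps) in the normalization pass. */
static float add_residual_sumsq(float *x, const float *r, size_t n)
{
    float sumsq = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float v = x[i] + r[i];
        x[i] = v;
        sumsq += v * v;
    }
    return sumsq;
}
```

The final reduction in `dot4` changes the order of floating-point additions compared with a single running sum, so results can differ in the last bits from a naive loop.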

Supported Models

Qwen3 (ChatML)
Qwen2 (ChatML)
Llama 3.1 (Llama 3 template)
Sherkala-8B (Kazakh/English/Russian)
KazLLM-8B (Kazakh/English)
Any GGUF Q4_K_M model

Quick Start

# Download
wget -O binaryai.exe https://github.com/bauratynov/binaryai-releases/releases/latest/download/binaryai-windows-amd64.exe

# Run chat
binaryai.exe -m model.gguf -p "Hello, world!"

# Run OpenAI-compatible server
binaryai.exe -m model.gguf --server --port 8080
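
A client for the server mode might build its request as sketched below. This rests on assumptions not confirmed here: that "OpenAI-compatible" means the server accepts `POST /v1/chat/completions` with the usual `model`/`messages` JSON body, and that the `model` value `model.gguf` is acceptable. The function `build_chat_request` is hypothetical and only formats the request text; sending it over a socket is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Writes a raw HTTP/1.1 request for an OpenAI-style chat completion into buf.
   Assumptions: endpoint path and body shape follow the OpenAI chat API;
   `prompt` contains no characters that need JSON escaping (quotes,
   backslashes, control characters).
   Returns the request length, or -1 if a buffer is too small. */
static int build_chat_request(char *buf, size_t cap, const char *host,
                              int port, const char *prompt)
{
    char body[1024];
    int body_len = snprintf(body, sizeof body,
        "{\"model\":\"model.gguf\","
        "\"messages\":[{\"role\":\"user\",\"content\":\"%s\"}]}",
        prompt);
    if (body_len < 0 || (size_t)body_len >= sizeof body)
        return -1;

    int n = snprintf(buf, cap,
        "POST /v1/chat/completions HTTP/1.1\r\n"
        "Host: %s:%d\r\n"
        "Content-Type: application/json\r\n"
        "Content-Length: %d\r\n"
        "Connection: close\r\n"
        "\r\n"
        "%s",
        host, port, body_len, body);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

With the server started as shown above, `build_chat_request(req, sizeof req, "localhost", 8080, "Hello, world!")` would produce a request aimed at port 8080.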

Documentation

Technical Whitepaper (PDF)

Download

Download Latest Release →

BinaryAI Engine — Built in Kazakhstan 🇰🇿
Combines research from BaiterekLLM, EntropyX, and BaiterekSkip
