STEEL

Streamlined Tensor Execution Engine Library

STEEL is a fun experiment in writing essentially pure C++ to build the most elegant tensor/ML library I can, while understanding how such libraries are implemented: from the math, to cache-aware, efficient algorithms, to good ML practices.

Build

cmake -S . -B build
cmake --build build -j$(nproc)

Binaries

build/matrix_bench — low-level matrix multiplication benchmark
build/qwen_infer — interactive Qwen2 inference
build/steel_bench — inference benchmark (prefill / decode / end-to-end)
build/steel_tests — unit tests

Benchmark

./build/steel_bench --model qwen2.5-0.5b-instruct-fp16.gguf --threads 8

Options:

| Flag | Default | Description |
| --- | --- | --- |
| `--model` | `qwen2.5-0.5b-instruct-fp16.gguf` | Path to GGUF model |
| `--threads` | auto | CPU threads |
| `--decode-tokens` | 64 | Tokens to generate per decode test |
| `--warmup` | 1 | Warmup iterations |
| `--iters` | 3 | Benchmark iterations (use 5 to match llama-bench) |

Reports mean ± stddev and best tok/s for prefill, decode, and end-to-end generation.
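
As a rough illustration of how such a summary can be computed from per-iteration timings, here is a minimal sketch. It is not taken from `bench.cpp`: the `ThroughputStats` struct and the `summarize` function are hypothetical names, and the use of the sample (n - 1) standard deviation is an assumption about how the spread is defined.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary type; not part of the STEEL sources.
struct ThroughputStats {
    double mean;    // average tok/s across timed iterations
    double stddev;  // spread of tok/s across timed iterations
    double best;    // highest tok/s observed
};

// tokens:  number of tokens processed in each timed iteration
// seconds: wall-clock duration of each timed iteration (warmup runs excluded)
inline ThroughputStats summarize(int tokens, const std::vector<double>& seconds) {
    assert(!seconds.empty());

    // Convert each duration into a throughput sample in tokens per second.
    std::vector<double> rates;
    rates.reserve(seconds.size());
    for (double s : seconds) rates.push_back(tokens / s);

    double sum = 0.0;
    for (double r : rates) sum += r;
    const double mean = sum / rates.size();

    double sq = 0.0;
    for (double r : rates) sq += (r - mean) * (r - mean);
    // Assumption: sample standard deviation (divide by n - 1); 0 for a single run.
    const double stddev =
        rates.size() > 1 ? std::sqrt(sq / (rates.size() - 1)) : 0.0;

    const double best = *std::max_element(rates.begin(), rates.end());
    return {mean, stddev, best};
}
```

With the default `--decode-tokens 64` and `--iters 3`, a call such as `summarize(64, {t1, t2, t3})` would yield the three numbers reported for the decode test, under the assumptions above.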

