April 2026 Open-Source LLM Coding Benchmark — 40 Models Ranked
desiorac announced in Announcements
Update: Just deployed an interactive model selector for this dataset — filter all 44 open-weight models by license, context window, and provider: https://ark-forge.github.io/genesis/open-llm-selector/ Useful if you're narrowing down candidates for a specific deployment constraint (e.g., Apache-2.0 only, 128k+ context, self-hostable).
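In case it helps anyone scripting against the same kind of catalog, the selector's filtering logic amounts to intersecting a few constraints over model metadata. Here's a minimal sketch in Python; the field names (`license`, `context_window`, `self_hostable`) and the entries themselves are illustrative assumptions, not the selector's actual schema or data.

```python
# Hypothetical model catalog — field names and values are illustrative,
# not taken from the published dataset.
MODELS = [
    {"name": "DeepSeek-Coder-V3",  "license": "MIT",        "context_window": 128_000, "self_hostable": True},
    {"name": "Qwen2.5-Coder-32B",  "license": "Apache-2.0", "context_window": 131_072, "self_hostable": True},
    {"name": "Yi-Coder-9B-Chat",   "license": "Apache-2.0", "context_window": 131_072, "self_hostable": True},
]

def select(models, license=None, min_context=0, self_hostable=None):
    """Return models satisfying every constraint that was actually given."""
    out = []
    for m in models:
        if license is not None and m["license"] != license:
            continue  # wrong license
        if m["context_window"] < min_context:
            continue  # context window too small
        if self_hostable is not None and m["self_hostable"] != self_hostable:
            continue  # deployment constraint not met
        out.append(m)
    return out

# Example: the "Apache-2.0 only, 128k+ context" constraint from the post.
picks = select(MODELS, license="Apache-2.0", min_context=128_000)
print([m["name"] for m in picks])  # → ['Qwen2.5-Coder-32B', 'Yi-Coder-9B-Chat']
```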
April 2026 Open-Source LLM Coding Benchmark — Results Published
After running 2,400+ coding tasks across 40 open-source models, I've published the benchmark results.
TL;DR: DeepSeek-Coder-V3 tops accuracy but Qwen2.5-Coder-32B wins on speed/accuracy tradeoff. Yi-Coder-9B-Chat is the biggest surprise — punches well above its weight class.
Key findings
Links
The full report includes per-language breakdowns (Python, JS/TS, Go, Rust, Java), deployment cost analysis, and recommended stacks for 6 use-case profiles.
Happy to answer questions about methodology or specific model comparisons.
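For readers wondering what a "speed/accuracy tradeoff" winner means concretely: one common way to frame it is Pareto optimality — a model wins a spot on the frontier if no other model beats it on both accuracy and throughput simultaneously. The sketch below is my own illustration of that idea with made-up numbers, not the benchmark's actual scoring method or results.

```python
# Hypothetical results — the numbers are invented for illustration only.
RESULTS = {
    "model-a": {"accuracy": 0.82, "tokens_per_sec": 35},
    "model-b": {"accuracy": 0.78, "tokens_per_sec": 90},
    "model-c": {"accuracy": 0.70, "tokens_per_sec": 60},
}

def pareto_frontier(results):
    """Return models not strictly dominated on both accuracy and speed."""
    frontier = []
    for name, r in results.items():
        dominated = any(
            o["accuracy"] > r["accuracy"] and o["tokens_per_sec"] > r["tokens_per_sec"]
            for other, o in results.items()
            if other != name
        )
        if not dominated:
            frontier.append(name)
    return sorted(frontier)

# model-c is dominated by model-b (more accurate AND faster), so it drops out.
print(pareto_frontier(RESULTS))  # → ['model-a', 'model-b']
```

A model like Qwen2.5-Coder-32B "winning the tradeoff" would, in this framing, mean it sits on that frontier while cheaper-to-run rivals fall below it.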