wangzizhe/GateForge

GateForge

CI  Release  License  Python >= 3.10

AI Agent for Physical Systems Modeling

Agentic Modelica Workflow Benchmark

Benchmark snapshot as of May 21, 2026.

All agents use the same foundation model family and are evaluated under the same benchmark and wall-clock conditions.

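GateForge solves more tasks than the two state-of-the-art coding agents it is compared against, Claude Code and OpenCode. All three agents tie on easy tasks, so the margin comes entirely from medium and hard Modelica workflows.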

Agent        Total    Easy   Medium  Hard
GateForge    130/132  21/21  56/56   53/55
Claude Code  123/132  21/21  55/56   47/55
OpenCode     120/132  21/21  50/56   49/55
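To make the margins easier to compare, the pass counts can be converted to percentages. The following is a minimal sketch, not part of GateForge: the `RESULTS` dictionary and the `pass_rate` and `summarize` helpers are hypothetical names introduced for illustration, and the only inputs are the pass counts from the table above.

```python
# Pass counts copied from the benchmark table: (passed, total) per tier.
# RESULTS, pass_rate and summarize are illustrative names, not GateForge APIs.
RESULTS = {
    "GateForge":   {"easy": (21, 21), "medium": (56, 56), "hard": (53, 55)},
    "Claude Code": {"easy": (21, 21), "medium": (55, 56), "hard": (47, 55)},
    "OpenCode":    {"easy": (21, 21), "medium": (50, 56), "hard": (49, 55)},
}


def pass_rate(passed: int, total: int) -> float:
    """Return the pass rate as a percentage."""
    return 100.0 * passed / total


def summarize(results: dict) -> dict:
    """Compute per-tier and overall pass rates for every agent."""
    summary = {}
    for agent, tiers in results.items():
        passed = sum(p for p, _ in tiers.values())
        total = sum(t for _, t in tiers.values())
        summary[agent] = {
            "total": (passed, total),
            "overall": pass_rate(passed, total),
            "rates": {tier: pass_rate(p, t) for tier, (p, t) in tiers.items()},
        }
    return summary


if __name__ == "__main__":
    for agent, row in summarize(RESULTS).items():
        passed, total = row["total"]
        hard = row["rates"]["hard"]
        print(f"{agent:12s} {passed}/{total} overall {row['overall']:.1f}%  hard {hard:.1f}%")
```

Summing the tiers also serves as a consistency check: each agent's per-tier counts add up to the reported total out of 132.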

GateForge beat both baselines on success rate and wall time. It ran faster and used fewer reported tokens than OpenCode, and it finished in less than half the wall time of Claude Code.

Agent        Reported tokens*  Wall time
GateForge    ~39.7M            ~14,658 s
Claude Code  ~15.9M            ~35,191 s
OpenCode     ~66.1M            ~20,843 s

*Reported tokens are runner-reported estimates; GateForge records provider usage directly, while other runners may omit local context management, compression, retries, or tool-output handling costs.
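The efficiency figures can be put on a common scale with a little arithmetic. This is a rough sketch under stated assumptions: the inputs are the rounded "~" values from the table, the solved-task counts come from the benchmark totals, and `RUNS`, `wall_time_ratio`, and `tokens_per_solved` are hypothetical names introduced here. Because token counts are runner-reported and not measured the same way across agents, the tokens-per-solved-task values are indicative only.

```python
# Rounded figures copied from the README tables; every derived number is
# therefore approximate. All names below are illustrative, not GateForge APIs.
RUNS = {
    "GateForge":   {"tokens": 39.7e6, "wall_s": 14_658, "solved": 130},
    "Claude Code": {"tokens": 15.9e6, "wall_s": 35_191, "solved": 123},
    "OpenCode":    {"tokens": 66.1e6, "wall_s": 20_843, "solved": 120},
}


def wall_time_ratio(runs: dict, agent: str, reference: str = "GateForge") -> float:
    """Return how many times longer `agent` ran than `reference`."""
    return runs[agent]["wall_s"] / runs[reference]["wall_s"]


def tokens_per_solved(runs: dict, agent: str) -> float:
    """Return reported tokens divided by the number of solved tasks."""
    return runs[agent]["tokens"] / runs[agent]["solved"]


if __name__ == "__main__":
    for agent in RUNS:
        ratio = wall_time_ratio(RUNS, agent)
        per_task = tokens_per_solved(RUNS, agent)
        print(f"{agent:12s} wall time x{ratio:.2f} vs GateForge, "
              f"~{per_task / 1e3:.0f}k reported tokens per solved task")
```

With these inputs, Claude Code's wall time is roughly 2.4 times GateForge's and OpenCode's roughly 1.4 times.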

Legal Notice

Without prior written permission, no content on this site may be used for AI model training, fine-tuning, evaluation, or dataset construction.

  • LEGAL_NOTICE.md
  • CONTENT_AUTHORIZATION_POLICY.md
  • robots.txt
