Semantic Gravity Well: 16 LLMs tested on hydrogen balloon inertia show perfect reasoning but wrong answers. 4,225 API calls reveal a systematic semantic bias in physics reasoning.
nlp machine-learning gemini gpt ai-safety claude cognitive-bias llm prompt-engineering chain-of-thought qwen llm-evaluation deepseek llm-benchmark semantic-bias physics-reasoning
Updated Apr 20, 2026 - Python