fix: Fix unexpected restart after exiting the program #640
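Summary by CodeRabbit
Release notes
Overview
This PR improves resource management and shutdown in three independent subsystems: the system tray thread now has a managed lifecycle instead of being created detached; the logging system defines an explicit shutdown order for the asynchronous sink (remove → stop → flush → reset); and on Windows, an atomically stored exit code plus an atexit handler ensure the correct process exit code is reported.
Changes
Application lifecycle and shutdown flow improvements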
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/main.cpp`:
- Around line 140-147: The atexit handler registered early uses TerminateProcess
with g_final_exit_code which is only set just before normal return, causing
early non-zero returns to be masked as zero; fix by removing or delaying the
early atexit registration and ensure g_final_exit_code is set (and std::atomic
synchronized) before any return path, or refactor main into a single exit point
that updates g_final_exit_code then calls the termination helper; also replace
TerminateProcess(GetCurrentProcess(), ...) with ExitProcess(...) in the final
termination handler to avoid skipping DLL_PROCESS_DETACH and reduce resource
leak risk.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: de6e6662-95fa-416c-81bd-f62c6b325d8a
📒 Files selected for processing (3)
src/logging.cpp
src/main.cpp
src/system_tray.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,c,h}
⚙️ CodeRabbit configuration file
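src/**/*.{cpp,c,h}: Sunshine core C++ source code, a self-hosted game streaming server. Review focus: memory safety, thread safety, RAII resource management, security vulnerabilities. Pay attention to platform-specific code controlled by preprocessor macros.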
Files:
src/main.cpp
src/logging.cpp
src/system_tray.cpp
🪛 Clang (14.0.6)
src/main.cpp
[error] 6-6: 'atomic' file not found
(clang-diagnostic-error)
🔇 Additional comments (3)
src/system_tray.cpp (2)
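1235-1245: Correct fix: this avoids std::terminate when the static std::thread is destroyed. The explicit join() replaces the previous "detached and unmanaged" pattern and is a key part of the shutdown changes that fix the restart problem. A few points are worth noting:
- join() has no timeout. If tray_loop does not return promptly after tray_exit(), for example because of how the third-party tray library is implemented, this call blocks the main thread's shutdown path indefinitely. The code currently relies on the atexit/TerminateProcess in main.cpp as a fallback; if that fallback is removed in the future, consider std::future + detach() with a timeout, or performing the join on another thread with a timeout.
- If any path lets a menu callback running on the tray thread itself indirectly call end_tray(), join() throws std::system_error(resource_deadlock_would_occur). This is already covered by the try/catch, so the behavior is safe.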
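The managed-lifecycle pattern described in this comment can be sketched in portable C++. This is only an illustration under stated assumptions: `ManagedWorker`, `start()`, and `stop()` are hypothetical names, the loop body is a placeholder for the tray loop, and none of it is taken from src/system_tray.cpp.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <system_error>
#include <thread>

// Hypothetical stand-in for the owner of the tray thread (not Sunshine's code).
class ManagedWorker {
public:
  ~ManagedWorker() { stop(); }

  // Start (or restart) the worker. A previous thread is stopped and joined first,
  // so the std::thread member is never overwritten while still joinable.
  void start() {
    stop();
    stop_requested_.store(false, std::memory_order_release);
    thread_ = std::thread([this] { loop(); });
  }

  // Ask the loop to exit, then join. Returns false if join() was refused.
  bool stop() {
    stop_requested_.store(true, std::memory_order_release);
    if (!thread_.joinable()) return true;
    try {
      thread_.join();
      return true;
    } catch (const std::system_error &) {
      // Reached, for example, when stop() is called from the worker thread
      // itself (resource_deadlock_would_occur).
      return false;
    }
  }

  bool running() const { return thread_.joinable(); }

private:
  // Placeholder loop: polls the stop flag, as a tray loop would poll for exit.
  void loop() {
    while (!stop_requested_.load(std::memory_order_acquire)) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

  std::atomic<bool> stop_requested_{false};
  std::thread thread_;  // declared last so the flag exists before the thread runs
};
```

Note that `join()` here still has no timeout, so the blocking concern raised in the first bullet applies to this sketch as well.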
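1402-1421: The thread lifecycle management improvement is sound. The previously locally scoped detached thread is now managed through the static member tray_thread, and any existing thread is join()ed before a restart. The logic is correct and, together with the join() in end_tray(), forms a symmetric RAII shutdown path.
There is no concurrency hazard between end_tray_called = false and the subsequent tray_thread.join(), provided init_tray_threaded and end_tray are never called concurrently (which is the case in the current code).
src/logging.cpp (1)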
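60-83: The shutdown order follows the official Boost.Log recommendation. The remove_sink → stop → flush → reset order in the code matches the canonical form in the Boost.Log documentation: first remove the sink from the core so that no new records are enqueued, then stop the asynchronous feeding thread, then flush the remaining records, and finally release the shared_ptr to trigger destruction.
For file_sink_ptr (a synchronous sink), calling only flush and not stop is correct, because a synchronous sink has no worker thread. This helps make program shutdown reliable.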
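To make the reasoning behind that order concrete, here is a toy model written with only the standard library. It is not Boost.Log and does not use its API: `AsyncSink`, `Core`, and `shutdown_sink` are hypothetical names that merely mimic the four steps named in the comment.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Toy asynchronous sink: records are queued and written by a worker thread.
class AsyncSink {
public:
  AsyncSink() : worker_([this] { feed(); }) {}
  ~AsyncSink() { stop(); }

  void push(std::string rec) {
    { std::lock_guard<std::mutex> lk(m_); queue_.push_back(std::move(rec)); }
    cv_.notify_one();
  }
  // Stop the feeding thread; records still queued stay queued.
  void stop() {
    { std::lock_guard<std::mutex> lk(m_); stopping_ = true; }
    cv_.notify_one();
    if (worker_.joinable()) worker_.join();
  }
  // Write whatever is still queued, on the calling thread.
  void flush() { std::lock_guard<std::mutex> lk(m_); drain_locked(); }
  std::vector<std::string> written() const {
    std::lock_guard<std::mutex> lk(m_);
    return out_;
  }

private:
  void drain_locked() {
    while (!queue_.empty()) {
      out_.push_back(std::move(queue_.front()));
      queue_.pop_front();
    }
  }
  void feed() {
    std::unique_lock<std::mutex> lk(m_);
    while (!stopping_) {
      cv_.wait(lk, [this] { return stopping_ || !queue_.empty(); });
      if (stopping_) break;  // leave the rest for flush()
      drain_locked();
    }
  }
  mutable std::mutex m_;
  std::condition_variable cv_;
  std::deque<std::string> queue_;
  std::vector<std::string> out_;
  bool stopping_ = false;
  std::thread worker_;  // declared last: started after the other members exist
};

// Toy logging core that fans records out to registered sinks.
class Core {
public:
  void add_sink(std::shared_ptr<AsyncSink> s) { sinks_.push_back(std::move(s)); }
  void remove_sink(const std::shared_ptr<AsyncSink> &s) {
    sinks_.erase(std::remove(sinks_.begin(), sinks_.end(), s), sinks_.end());
  }
  void log(const std::string &rec) { for (auto &s : sinks_) s->push(rec); }
private:
  std::vector<std::shared_ptr<AsyncSink>> sinks_;
};

// Shut down in the order the review describes; returns what the sink wrote.
std::vector<std::string> shutdown_sink(Core &core, std::shared_ptr<AsyncSink> &sink) {
  core.remove_sink(sink);  // 1. no new records can reach the sink
  sink->stop();            // 2. join the feeding thread
  sink->flush();           // 3. write records that were still queued
  auto written = sink->written();
  sink.reset();            // 4. drop the last reference; the destructor runs
  return written;
}
```

In this model, skipping step 1 would let `Core::log` keep pushing records into a sink whose worker has already stopped, and skipping step 3 would drop any records the worker had not reached before it was stopped.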
// Note: this only fires on a normal `return` from main. If the program
// crashes mid-run (uncaught exception, AV, abort/terminate), atexit is
// not invoked, so the service supervisor still observes a non-zero exit
// code and can restart Sunshine as usual.
std::atexit([]() {
  TerminateProcess(GetCurrentProcess(),
    static_cast<UINT>(g_final_exit_code.load(std::memory_order_acquire)));
});
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check where lifetime::desired_exit_code is written, to confirm whether early return paths update it
rg -nP --type=cpp -C2 '\bdesired_exit_code\b'
Repository: AlkaidLab/foundation-sunshine
Length of output: 1958
🏁 Script executed:
# View the overall structure of main, especially lines 140-485 and the key return statements
head -n 20 src/main.cpp | tail -n 5 # confirm the file header
wc -l src/main.cpp # line count of the file
Repository: AlkaidLab/foundation-sunshine
Length of output: 194
🏁 Script executed:
# View the main structure of main, including the atexit registration and each return statement
sed -n '140,200p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 2451
🏁 Script executed:
# View the return 7 and return fn->second statements around lines 220-240
sed -n '220,240p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 743
🏁 Script executed:
# View the return -1 and return lifetime::desired_exit_code statements around lines 390-420
sed -n '390,420p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 1057
🏁 Script executed:
# View lines 476-485 (from after the atexit registration to the return from main)
sed -n '476,485p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 514
🏁 Script executed:
# View the initial value of lifetime::desired_exit_code
sed -n '80,95p' src/entry_handler.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 530
🏁 Script executed:
# Find every assignment to or modification of lifetime::desired_exit_code
rg -nP --type=cpp 'desired_exit_code\s*[=.]' -B2 -A2
Repository: AlkaidLab/foundation-sunshine
Length of output: 723
🏁 Script executed:
# View the --version and config::parse return paths at lines 170-185
sed -n '170,185p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 511
🏁 Script executed:
# Confirm the full entry point and signature of main
sed -n '130,160p' src/main.cpp
Repository: AlkaidLab/foundation-sunshine
Length of output: 1237
🏁 Script executed:
# Check whether g_final_exit_code is modified anywhere else (besides line 482)
rg -nP --type=cpp 'g_final_exit_code\s*\.' -B1 -A1
Repository: AlkaidLab/foundation-sunshine
Length of output: 472
🏁 Script executed:
# Confirm where exit_sunshine is called, and when lifetime::desired_exit_code changes
rg -nP --type=cpp 'exit_sunshine\s*\(' -B2 -A2 | head -n 50
Repository: AlkaidLab/foundation-sunshine
Length of output: 2320
🏁 Script executed:
# View the full main function (lines 133 to 485) to confirm g_final_exit_code is not synchronized anywhere else
sed -n '133,250p' src/main.cpp | grep -n 'return\|g_final_exit_code'
Repository: AlkaidLab/foundation-sunshine
Length of output: 388
🏁 Script executed:
# Check once more every place that touches g_final_exit_code
rg -nP --type=cpp 'g_final_exit_code' src/
Repository: AlkaidLab/foundation-sunshine
Length of output: 342
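Critical defect: the atexit handler turns every early non-zero exit code into 0, hiding real failures.
g_final_exit_code is written only at line 482, just before main returns normally, but the atexit lambda is registered early in main (lines 143-147), so it takes effect on every return path. As a result:
- Line 226, return 7; (unknown command) → atexit calls TerminateProcess with 0, and the caller sees exit code 0
- Line 229, return fn->second(...); (subcommand return code) → likewise overwritten with 0
- Line 397, return -1; (HTTP initialization failure) → likewise overwritten with 0
- Line 416, return lifetime::desired_exit_code; (early shutdown_event hit) → overwritten with 0 (because g_final_exit_code has not been synchronized)
Every failure is disguised as a successful exit, so service managers, monitoring systems, and similar tools cannot tell from the exit code whether the program succeeded. Either update g_final_exit_code before every return, or refactor the body of main to a single return point.
Additional suggestions (minor):
- TerminateProcess(GetCurrentProcess(), …) skips the DLL_PROCESS_DETACH notification for DLLs; ExitProcess can also force the exit code and is gentler, so consider switching to it.
- Because the atexit handler is registered early in main, it runs before the destructors of global/static objects that were constructed before the registration. Since TerminateProcess ends the process immediately, those destructors never run, which may leak cross-process resources such as named mutexes and file locks.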
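One way the single-exit-point suggestion could look is sketched below in portable C++. Only the name `g_final_exit_code` comes from the PR; `Startup`, `run_app`, `main_with_single_exit`, and `code_seen_by_exit_handler` are hypothetical, and the Windows termination call is left out so that the sketch runs anywhere.

```cpp
#include <atomic>
#include <cassert>

// Mirrors the PR's g_final_exit_code; everything else here is hypothetical.
std::atomic<int> g_final_exit_code{0};

enum class Startup { ok, unknown_command, http_init_failed };

// Stand-in for the body of main(): it may return early with a non-zero code.
int run_app(Startup outcome) {
  if (outcome == Startup::unknown_command) return 7;
  if (outcome == Startup::http_init_failed) return -1;
  return 0;
}

// Single exit point: every return path of run_app() flows through here, so the
// value a late exit handler reads always matches the real result.
int main_with_single_exit(Startup outcome) {
  const int code = run_app(outcome);
  g_final_exit_code.store(code, std::memory_order_release);
  return code;
}

// The value an exit handler would pass on to ExitProcess or TerminateProcess.
int code_seen_by_exit_handler() {
  return g_final_exit_code.load(std::memory_order_acquire);
}
```

With this shape, an early `return 7` and the final return both update the atomic before the handler can observe it. Whether early failures should propagate a non-zero code at all is a separate design decision.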
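This does not mask real errors; the user's intent here is to exit completely. Otherwise the Sunshine service would relaunch sunshine.exe. This change does not affect abnormal exits while running.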
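@Yundi339, thanks for clarifying the design intent! Understood. The core logic is:
- User-initiated exit → g_final_exit_code is updated to lifetime::desired_exit_code (usually 0) before the final return → atexit calls TerminateProcess(0) → the service manager sees 0 and does not restart
- Abnormal crash while running (uncaught exception, AV, abort, etc.) → atexit is not invoked → the process exits with a non-zero code → the service manager triggers a restart
The design is self-consistent. The one edge case worth watching is a failure during startup (for example, the return -1 on HTTP initialization failure at line 397): g_final_exit_code has not been updated at that point, so atexit still calls TerminateProcess with 0 and the service does not restart. If that is the intended behavior (avoiding an endless service restart loop on configuration errors), it is entirely reasonable and needs no change.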
✏️ Learnings added
Learnt from: Yundi339
Repo: AlkaidLab/foundation-sunshine PR: 640
File: src/main.cpp:140-147
Timestamp: 2026-05-11T01:56:23.184Z
Learning: In AlkaidLab/foundation-sunshine (src/main.cpp), the atexit handler using TerminateProcess with g_final_exit_code is intentional design: when Sunshine exits normally (user-initiated), g_final_exit_code is set to lifetime::desired_exit_code (0) just before the final return, causing the service supervisor to see exit code 0 and NOT restart. For abnormal crashes/exceptions, atexit is not invoked so the supervisor sees a non-zero code and restarts as desired. Early startup failure returns (e.g. return -1 for HTTP init) also result in exit code 0 via atexit, which is also intentional to avoid restart loops on config errors.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
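Expected behavior
| Scenario | Flow | exit_code |
| --- | --- | --- |
| Normal exit (tray exit / SIGTERM / WM_ENDSESSION) | RAII → static destructors → atexit → TerminateProcess(0) | 0 ✅ No restart |
| Clean exit with a non-zero exit code (error, but via main return) | RAII → static destructors → atexit → TerminateProcess(non-zero) | non-zero ✅ Restart |
| Real crash while running (AV/abort/terminate/uncaught exception) | atexit not triggered → original ntdll exception path | 0xC0000005 ✅ Restart |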
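The restart rule implied by the table can be written as a small predicate. This is a sketch that assumes the service supervisor restarts Sunshine on any non-zero exit code, as the thread describes; `should_restart` and `kStatusAccessViolation` are hypothetical names, not part of the Sunshine service.

```cpp
#include <cassert>
#include <cstdint>

// Windows NTSTATUS value for an access violation (STATUS_ACCESS_VIOLATION).
constexpr std::uint32_t kStatusAccessViolation = 0xC0000005u;

// Assumed supervisor policy: exit code 0 means "stay down", anything else restarts.
constexpr bool should_restart(std::uint32_t exit_code) {
  return exit_code != 0;
}
```

Under this policy, the three rows of the table map to `should_restart(0)`, `should_restart(<non-zero>)`, and `should_restart(kStatusAccessViolation)`.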