fix: Fix unexpected restart after exiting the program #640

Merged
Yundi339 merged 4 commits into master from fix_fullquit on May 11, 2026

Conversation

Yundi339 (Collaborator) commented May 11, 2026

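Expected behavior

| Scenario | Flow | exit_code |
| --- | --- | --- |
| Normal exit (tray quit / SIGTERM / WM_ENDSESSION) | RAII → static destructors → atexit → TerminateProcess(0) | 0 ✅ no restart |
| Clean exit with a non-zero code (an error, but still via `return` from main) | RAII → static destructors → atexit → TerminateProcess(non-zero) | non-zero ✅ restart |
| Real crash at runtime (AV / abort / terminate / uncaught exception) | atexit not triggered → original ntdll exception path | 0xC0000005 ✅ restart |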
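
The table reduces to a single rule on the supervisor side. The following is a minimal sketch of that rule, assuming the Sunshine service restarts the process on any non-zero exit code; `should_restart` and `kAccessViolation` are hypothetical names chosen for illustration, not identifiers from the repository.

```cpp
#include <cstdint>

// STATUS_ACCESS_VIOLATION, the exit code listed for a real crash in the table.
constexpr std::uint32_t kAccessViolation = 0xC0000005u;

// Hypothetical supervisor rule (an assumption for illustration):
// restart on any non-zero exit code, stay down on 0.
constexpr bool should_restart(std::uint32_t exit_code) {
  return exit_code != 0;
}

static_assert(!should_restart(0), "normal exit: no restart");
static_assert(should_restart(1), "non-zero clean exit: restart");
static_assert(should_restart(kAccessViolation), "crash: restart");
```

Under this assumption, the only way to keep the service from relaunching the process is to make sure the final exit code really is 0 on a user-initiated quit.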

coderabbitai Bot commented May 11, 2026

Summary by CodeRabbit

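Release notes

  • Bug fixes
    • Improved application exit-code handling on Windows so that the process termination handler reports the exit status the application intended.
    • Improved lifecycle management of the system tray thread, making application shutdown more stable.
    • Improved cleanup of the logging system's asynchronous buffers to preserve data integrity at shutdown.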

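Overview

This PR improves resource management and shutdown in three independent subsystems: the system tray thread now has a managed lifecycle instead of being created detached; the logging system defines an explicit shutdown order for asynchronous sinks (remove → stop → flush → reset); and on Windows an atomic variable plus an atexit handler ensure that the correct process exit code is delivered.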

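Changes

Application lifecycle and shutdown improvements

| Layer / File | Summary |
| --- | --- |
| System tray thread lifecycle management (`src/system_tray.cpp`) | `init_tray_threaded()` now joins any existing `tray_thread` before starting a new thread and assigns the new thread to the member variable instead of creating a detached thread; `end_tray()` joins the thread at shutdown and handles exceptions. |
| Logging asynchronous sink shutdown order (`src/logging.cpp`) | `logging::deinit()` drops the initial `log_flush()` call, uses a single core handle, and applies an explicit order to each sink: remove from core → stop the asynchronous worker thread → flush pending records → reset the shared pointer. |
| Atomic exit-code handling on Windows (`src/main.cpp`) | Adds `#include <atomic>` and the Windows headers; introduces a global `std::atomic<int> g_final_exit_code` under `_WIN32`; `main()` registers an atexit callback at startup that calls TerminateProcess; stores `lifetime::desired_exit_code` into the atomic before returning. |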

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

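🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The pull request has no description. The title is relevant, but descriptive information is missing entirely. | Add a detailed pull request description covering the specific problem fixed, the implementation approach, and testing. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: fixing the unexpected restart after the program exits. It relates directly to the changes in the changeset covering logger teardown, the Windows process exit code, and the system tray thread lifecycle. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |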

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main.cpp`:
- Around line 140-147: The atexit handler registered early uses TerminateProcess
with g_final_exit_code which is only set just before normal return, causing
early non-zero returns to be masked as zero; fix by removing or delaying the
early atexit registration and ensure g_final_exit_code is set (and std::atomic
synchronized) before any return path, or refactor main into a single exit point
that updates g_final_exit_code then calls the termination helper; also replace
TerminateProcess(GetCurrentProcess(), ...) with ExitProcess(...) in the final
termination handler to avoid skipping DLL_PROCESS_DETACH and reduce resource
leak risk.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: de6e6662-95fa-416c-81bd-f62c6b325d8a

📥 Commits

Reviewing files that changed from the base of the PR and between 5a6950f and d6eb632.

📒 Files selected for processing (3)
  • src/logging.cpp
  • src/main.cpp
  • src/system_tray.cpp
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,c,h}

⚙️ CodeRabbit configuration file

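src/**/*.{cpp,c,h}: Sunshine core C++ source code, a self-hosted game streaming server. Review focus: memory safety, thread safety, RAII resource management, and security vulnerabilities. Pay attention to platform-specific code controlled by preprocessor macros.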

Files:

  • src/main.cpp
  • src/logging.cpp
  • src/system_tray.cpp
🪛 Clang (14.0.6)
src/main.cpp

[error] 6-6: 'atomic' file not found

(clang-diagnostic-error)

🔇 Additional comments (3)
src/system_tray.cpp (2)

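1235-1245: Correct fix: avoids std::terminate when the static std::thread is destroyed

An explicit join() replaces the earlier "detached and unmanaged" pattern and is a key part of the shutdown changes that fix the restart problem. A few points worth noting:

  • join() has no timeout. If tray_loop does not return promptly after tray_exit() because of the third-party tray library's implementation, this call blocks the main thread's shutdown path indefinitely. The code currently relies on the atexit/TerminateProcess in main.cpp as a fallback; if that fallback is removed in the future, consider switching to std::future with a timeout plus detach(), or performing the join in another thread with a timeout.
  • If any path lets a menu callback running on the tray thread indirectly trigger end_tray(), join() throws std::system_error(resource_deadlock_would_occur). The try/catch already covers this, so the behavior is safe.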
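
The timeout suggestion in the first bullet could look roughly like the sketch below. It is not code from the PR: `TimedWorker` and `join_for` are hypothetical names, and the approach (a `std::packaged_task` whose future signals completion, with `detach()` on timeout) is only one way to realize the idea.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical wrapper: a worker thread whose completion can be awaited
// with a timeout instead of an unbounded join().
class TimedWorker {
public:
  template <typename Fn>
  explicit TimedWorker(Fn fn) {
    std::packaged_task<void()> task(std::move(fn));
    done_ = task.get_future();               // becomes ready when fn returns
    thread_ = std::thread(std::move(task));  // the thread owns the task
  }

  TimedWorker(const TimedWorker &) = delete;
  TimedWorker &operator=(const TimedWorker &) = delete;

  // Returns true if the worker finished and was joined within `timeout`.
  // On timeout the thread is detached so the shutdown path is not blocked.
  bool join_for(std::chrono::milliseconds timeout) {
    if (!thread_.joinable()) return true;
    if (done_.wait_for(timeout) == std::future_status::ready) {
      thread_.join();
      return true;
    }
    thread_.detach();
    return false;
  }

  // A still-joinable std::thread would call std::terminate on destruction.
  ~TimedWorker() {
    if (thread_.joinable()) thread_.detach();
  }

private:
  std::future<void> done_;
  std::thread thread_;
};
```

A detached worker must not touch objects that are destroyed while it is still running, so this trade-off only makes sense when the process is about to end anyway.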

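1402-1421: The thread lifecycle management improvement is sound.

The previously locally scoped detached thread is now managed through the static member tray_thread, and any existing thread is joined before a restart. The logic is correct and, together with the join() in end_tray(), forms a symmetric RAII shutdown path.

There is no concurrency hazard between end_tray_called = false and the subsequent tray_thread.join(), provided init_tray_threaded and end_tray are never called concurrently (which is the case in the current code).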
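
The managed lifecycle described in these two comments can be sketched as follows. This is an illustration under assumptions, not the code in src/system_tray.cpp: `TrayRunner`, `start`, `stop`, and `loop` are hypothetical stand-ins for the tray module, and the loop body merely counts iterations.

```cpp
#include <atomic>
#include <system_error>
#include <thread>

// Hypothetical stand-in for the tray module: one owned worker thread.
class TrayRunner {
public:
  // Joins any previous worker first, then starts a new one owned by the member.
  void start() {
    stop();
    stop_requested_.store(false, std::memory_order_release);
    worker_ = std::thread([this] { loop(); });
  }

  // Signals the loop to exit and joins the worker; safe to call repeatedly.
  void stop() {
    stop_requested_.store(true, std::memory_order_release);
    try {
      if (worker_.joinable()) worker_.join();
    } catch (const std::system_error &) {
      // For example resource_deadlock_would_occur if called on the worker itself.
    }
  }

  ~TrayRunner() { stop(); }

  int iterations() const { return iterations_.load(std::memory_order_acquire); }

private:
  void loop() {
    while (!stop_requested_.load(std::memory_order_acquire)) {
      iterations_.fetch_add(1, std::memory_order_acq_rel);
      std::this_thread::yield();
    }
  }

  std::atomic<bool> stop_requested_{false};
  std::atomic<int> iterations_{0};
  std::thread worker_;
};
```

As in the review comment, the sketch assumes `start()` and `stop()` are never called concurrently from different threads.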

src/logging.cpp (1)

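60-83: The shutdown order follows the official Boost.Log recommendation.

The remove_sink → stop → flush → reset order in the code matches the canonical pattern in the Boost.Log documentation: first remove the sink from the core so that no new records are accepted, then stop the asynchronous feeding thread, then flush the remaining records, and finally release the shared_ptr to trigger destruction.

Calling only flush, not stop, on file_sink_ptr (a synchronous sink) is correct because a synchronous sink has no worker thread. This helps make program shutdown reliable.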
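
To make the ordering concrete without depending on Boost, here is a self-contained sketch that imitates it. It does not use the Boost.Log API: `AsyncSink`, `Core`, and `shutdown_sink` are hypothetical stand-ins, and the sink simply moves queued records into a vector.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical async sink (not Boost.Log): a queue drained by a worker thread.
class AsyncSink {
public:
  AsyncSink() : worker_([this] { feed(); }) {}
  ~AsyncSink() { stop(); }

  void push(std::string record) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(std::move(record));
    }
    cv_.notify_one();
  }

  // Stops the worker; records still queued are left for flush().
  void stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_one();
    if (worker_.joinable()) worker_.join();
  }

  // Drains the remaining records on the calling thread.
  void flush() {
    std::lock_guard<std::mutex> lock(mutex_);
    drain_locked();
  }

  std::size_t written() {
    std::lock_guard<std::mutex> lock(mutex_);
    return written_.size();
  }

private:
  void drain_locked() {
    while (!queue_.empty()) {
      written_.push_back(std::move(queue_.front()));
      queue_.pop();
    }
  }

  void feed() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!stopping_) {
      cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
      if (!stopping_) drain_locked();
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::string> queue_;
  std::vector<std::string> written_;
  bool stopping_ = false;
  std::thread worker_;  // declared last so it starts after the other members exist
};

// Hypothetical core: fans each record out to the registered sinks.
class Core {
public:
  void add_sink(std::shared_ptr<AsyncSink> sink) { sinks_.push_back(std::move(sink)); }
  void remove_sink(const std::shared_ptr<AsyncSink> &sink) {
    sinks_.erase(std::remove(sinks_.begin(), sinks_.end(), sink), sinks_.end());
  }
  void log(const std::string &record) {
    for (auto &sink : sinks_) sink->push(record);
  }

private:
  std::vector<std::shared_ptr<AsyncSink>> sinks_;
};

// Shutdown in the order named above: remove -> stop -> flush -> reset.
void shutdown_sink(Core &core, std::shared_ptr<AsyncSink> &sink) {
  if (!sink) return;
  core.remove_sink(sink);  // no new records reach the sink
  sink->stop();            // worker thread exits
  sink->flush();           // leftover records are written
  sink.reset();            // release our reference
}
```

Removing the sink first is what keeps the later flush final: once the core no longer holds it, nothing can enqueue a record after the flush has run.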

Comment thread src/main.cpp
Comment on lines +140 to +147
// Note: this only fires on a normal `return` from main. If the program
// crashes mid-run (uncaught exception, AV, abort/terminate), atexit is
// not invoked, so the service supervisor still observes a non-zero exit
// code and can restart Sunshine as usual.
std::atexit([]() {
  TerminateProcess(GetCurrentProcess(),
                   static_cast<UINT>(g_final_exit_code.load(std::memory_order_acquire)));
});
coderabbitai Bot commented May 11, 2026

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check where lifetime::desired_exit_code is written and whether early return paths update it
rg -nP --type=cpp -C2 '\bdesired_exit_code\b'

Repository: AlkaidLab/foundation-sunshine

Length of output: 1958


🏁 Script executed:

# Inspect the overall structure of main, especially lines 140-485 and the key return statements
head -n 20 src/main.cpp | tail -n 5  # confirm the file header
wc -l src/main.cpp  # line count of the file

Repository: AlkaidLab/foundation-sunshine

Length of output: 194


🏁 Script executed:

# Inspect the main structure of main, including the atexit registration and each return statement
sed -n '140,200p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 2451


🏁 Script executed:

# Inspect the return 7 and return fn->second statements around lines 220-240
sed -n '220,240p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 743


🏁 Script executed:

# Inspect return -1 and return lifetime::desired_exit_code around lines 390-420
sed -n '390,420p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 1057


🏁 Script executed:

# Inspect lines 476-485 (from after the atexit registration to the return from main)
sed -n '476,485p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 514


🏁 Script executed:

# Inspect the initial value of lifetime::desired_exit_code
sed -n '80,95p' src/entry_handler.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 530


🏁 Script executed:

# Find every assignment to or modification of lifetime::desired_exit_code
rg -nP --type=cpp 'desired_exit_code\s*[=.]' -B2 -A2

Repository: AlkaidLab/foundation-sunshine

Length of output: 723


🏁 Script executed:

# Inspect the --version and config::parse return paths in lines 170-185
sed -n '170,185p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 511


🏁 Script executed:

# Confirm the full entry point and signature of main
sed -n '130,160p' src/main.cpp

Repository: AlkaidLab/foundation-sunshine

Length of output: 1237


🏁 Script executed:

# Check whether anything else modifies g_final_exit_code (other than line 482)
rg -nP --type=cpp 'g_final_exit_code\s*\.' -B1 -A1

Repository: AlkaidLab/foundation-sunshine

Length of output: 472


🏁 Script executed:

# Confirm where exit_sunshine is called and when lifetime::desired_exit_code changes
rg -nP --type=cpp 'exit_sunshine\s*\(' -B2 -A2 | head -n 50

Repository: AlkaidLab/foundation-sunshine

Length of output: 2320


🏁 Script executed:

# Scan the full main function (lines 133 to 485) to confirm nothing else synchronizes g_final_exit_code
sed -n '133,250p' src/main.cpp | grep -n 'return\|g_final_exit_code'

Repository: AlkaidLab/foundation-sunshine

Length of output: 388


🏁 Script executed:

# Check once more for every place that modifies g_final_exit_code
rg -nP --type=cpp 'g_final_exit_code' src/

Repository: AlkaidLab/foundation-sunshine

Length of output: 342


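Critical defect: the atexit handler swallows every early non-zero exit code and reports 0, masking real failures.

g_final_exit_code is written only at line 482, just before main returns normally, but the atexit lambda is registered early in main (lines 143-147) and therefore takes effect on every return path. As a result:

  • Line 226, return 7; (unknown command) → atexit calls TerminateProcess with 0, and the caller sees exit code 0
  • Line 229, return fn->second(...); (subcommand return code) → likewise overwritten with 0
  • Line 397, return -1; (HTTP initialization failure) → likewise overwritten with 0
  • Line 416, return lifetime::desired_exit_code; (early shutdown_event hit) → overwritten with 0 (because g_final_exit_code is not synchronized)

Every failure is disguised as a successful exit, so service managers, monitoring systems, and similar tools cannot tell from the exit code whether the program succeeded. Either update g_final_exit_code before every return, or refactor the body of main into a single return point.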

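Additional suggestions (minor):

  • TerminateProcess(GetCurrentProcess(), …) skips the DLL_PROCESS_DETACH notification to DLLs; ExitProcess can also force the exit code and is gentler, so consider using it instead.
  • Because the atexit handler is registered early in main, it runs before the destructors of the global/static objects that were constructed before that point (handlers and static destructors run in reverse order of registration). Since TerminateProcess ends the process immediately, none of those destructors run, which can leak cross-process resources such as named mutexes and file locks.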
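
The "single return point" option mentioned above could be sketched as follows. This illustrates the reviewer's suggestion only, not the merged code: `run_app` and `run_and_record` are hypothetical names, the branch conditions are placeholders, and only `g_final_exit_code` is taken from the PR.

```cpp
#include <atomic>

// Name taken from the PR; the atexit handler reads this value.
std::atomic<int> g_final_exit_code{0};

// Hypothetical stand-in for the original body of main() with its early returns.
int run_app(int argc) {
  if (argc > 2) return 7;   // e.g. unknown command
  if (argc > 1) return -1;  // e.g. startup failure
  return 0;                 // normal shutdown
}

// Single exit point: whatever run_app() returns is recorded before leaving,
// so an exit handler that reads g_final_exit_code sees the real code.
int run_and_record(int argc) {
  const int rc = run_app(argc);
  g_final_exit_code.store(rc, std::memory_order_release);
  return rc;
}
```

With this shape, main() would reduce to `return run_and_record(argc);`, and no return path could leave the atomic at its initial 0 by accident.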

Yundi339 (Collaborator, Author) replied:

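This does not mask real errors: the user's intent here is to quit completely. Otherwise the Sunshine service would relaunch sunshine.exe. The change does not affect abnormal exits at runtime.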

coderabbitai Bot replied:

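@Yundi339, thanks for clarifying the design intent! Understood. The core logic is:

  • User-initiated exit → g_final_exit_code is updated to lifetime::desired_exit_code (usually 0) before the final return → atexit calls TerminateProcess(0) → the service manager sees 0 and does not restart
  • Abnormal crash at runtime (uncaught exception, AV, abort, etc.) → atexit is not called → the process exits with a non-zero code → the service manager triggers a restart

The design is self-consistent. The one edge case worth watching is a startup failure (for example, the return -1 at line 397 when HTTP initialization fails): g_final_exit_code has not been updated at that point, so atexit still calls TerminateProcess with 0 and the service does not restart. If that is the intended behavior (avoiding an endless restart loop on a configuration error), it is entirely reasonable and needs no change.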


✏️ Learnings added
Learnt from: Yundi339
Repo: AlkaidLab/foundation-sunshine PR: 640
File: src/main.cpp:140-147
Timestamp: 2026-05-11T01:56:23.184Z
Learning: In AlkaidLab/foundation-sunshine (src/main.cpp), the atexit handler using TerminateProcess with g_final_exit_code is intentional design: when Sunshine exits normally (user-initiated), g_final_exit_code is set to lifetime::desired_exit_code (0) just before the final return, causing the service supervisor to see exit code 0 and NOT restart. For abnormal crashes/exceptions, atexit is not invoked so the supervisor sees a non-zero code and restarts as desired. Early startup failure returns (e.g. return -1 for HTTP init) also result in exit code 0 via atexit, which is also intentional to avoid restart loops on config errors.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@Yundi339 Yundi339 merged commit d111033 into master May 11, 2026
4 checks passed