RDKEMW-12824: Propagate signal from DobbyInit to DobbyDaemon #434

Open
ks734 wants to merge 4 commits into develop from topic/RDKEMW-12824

ks734 commented Apr 17, 2026

Description

When a container is killed by a signal, DobbyDaemon expects to see a WIFSIGNALED wait status from the container's runc process. However, DobbyInit is PID 1 of the container's PID namespace, and the kernel protects it from signals with default disposition: even raise() is silently dropped, making the conventional "reset to SIG_DFL + raise()" approach impossible.

As a result, DobbyInit was exiting with code 0 regardless of the signal received, causing DobbyDaemon to incorrectly treat signal-killed containers as clean exits (WIFEXITED, status 0x0000).

Adopt the POSIX shell convention: record the signal number in a volatile sig_atomic_t from the signal handler, and after all children have been reaped call _exit(128 + signum). On the DobbyDaemon side, detect exit codes in the range 129-192 and synthesise the equivalent WIFSIGNALED wait status.
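A minimal sketch of the DobbyInit side of this convention, for illustration only. The names `gTermSignal`, `recordSignal`, `installRecorder` and `exitCodeAfterReap` are hypothetical and are not taken from InitMain.cpp; the sketch only shows the "record in the handler, encode after reaping" pattern described above.

```cpp
#include <signal.h>

// Hypothetical names for illustration; not taken from InitMain.cpp.
static volatile sig_atomic_t gTermSignal = 0;

// Async-signal-safe handler: it only stores the signal number.
extern "C" void recordSignal(int signum)
{
    gTermSignal = signum;
}

// Install the recording handler for one signal; returns true on success.
bool installRecorder(int signum)
{
    struct sigaction sa = {};
    sa.sa_handler = recordSignal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, nullptr) == 0;
}

// Exit code to report once every child has been reaped:
// 128 + signum if a signal was recorded, otherwise the fallback code.
int exitCodeAfterReap(int fallbackCode)
{
    const int sig = gTermSignal;
    return (sig > 0) ? 128 + sig : fallbackCode;
}
```

Under these assumptions, an init process would finish with something like `_exit(exitCodeAfterReap(0))` after its reaping loop, so a recorded SIGABRT (signal 6 on Linux) would surface as exit code 134.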

Test Procedure

Send SIGABRT, SIGSEGV, or any other fatal signal to DobbyInit or to another process within the running container. The app should crash as expected.

Expected:

  1. WIFEXITED should return false
  2. WIFSIGNALED should return true
  3. WCOREDUMP should return true
  4. WTERMSIG should return the signal used.
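The four expectations above can be checked against a synthesised status. The sketch below assumes the Linux wait-status layout (terminating signal in the low 7 bits, core-dump flag in bit 0x80) and assumes the core flag is set for signals whose default action dumps core. Both `dumpsCoreByDefault` and `makeSignaledStatus` are hypothetical helpers, not functions from DobbyManager.cpp.

```cpp
#include <signal.h>
#include <sys/wait.h>

// Linux-specific assumption: a wait status keeps the terminating signal in
// the low 7 bits and the core-dump flag in bit 0x80.
constexpr int kCoreFlag = 0x80;

// Signals whose default action is to dump core (see signal(7)).
bool dumpsCoreByDefault(int sig)
{
    switch (sig)
    {
        case SIGQUIT: case SIGILL: case SIGTRAP: case SIGABRT: case SIGBUS:
        case SIGFPE: case SIGSEGV: case SIGSYS: case SIGXCPU: case SIGXFSZ:
            return true;
        default:
            return false;
    }
}

// Build a status for which WIFSIGNALED() is true and WTERMSIG() == sig.
int makeSignaledStatus(int sig)
{
    return (sig & 0x7f) | (dumpsCoreByDefault(sig) ? kCoreFlag : 0);
}
```

With this layout, `makeSignaledStatus(SIGSEGV)` satisfies all four checks, while a status built for SIGTERM reports WIFSIGNALED without WCOREDUMP.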

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Other (doesn't fit into the above categories - e.g. documentation updates)

Requires Bitbake Recipe changes?

  • The base Bitbake recipe (meta-rdk-ext/recipes-containers/dobby/dobby.bb) must be modified to support the changes in this PR (beyond updating SRC_REV)

Copilot AI left a comment

Pull request overview

This PR aims to correctly propagate “signal-killed” container termination causes from DobbyInit (PID 1 in the container PID namespace) up to DobbyDaemon, by encoding the terminating signal as an exit code (128 + signum) and reconstructing a WIFSIGNALED-style wait status in the daemon.

Changes:

  • DobbyInit: record a received/observed terminating signal and _exit(128 + sig) after reaping children.
  • DobbyManager: detect WEXITSTATUS in the 128+sig range and synthesize a WIFSIGNALED-compatible wait status (optionally setting WCOREDUMP).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| daemon/lib/source/DobbyManager.cpp | Decodes 128+sig exits into synthesized WIFSIGNALED statuses (plus optional WCOREDUMP). |
| daemon/init/source/InitMain.cpp | Tracks signal receipt/child signal death and exits via the 128+sig convention. |

Comment thread daemon/lib/source/DobbyManager.cpp
Comment on lines +3187 to +3197
if (WIFEXITED(status))
{
    int exitCode = WEXITSTATUS(status);
    if (exitCode > 128 && exitCode <= 128 + 64)
    {
        int sig = exitCode - 128;
        AI_LOG_INFO("container '%s' exited with code %d, "
                    "interpreting as killed by signal %d (%s) "
                    "(PID 1 namespace init convention)",
                    id.c_str(), exitCode, sig, strsignal(sig));

Copilot AI Apr 24, 2026

There are L1 tests for DobbyManager, but this new “exit code 128+signal => synthesized WIFSIGNALED” behavior appears untested. Adding a focused unit/integration test that exercises onChildExit handling for (a) a signaled container and (b) a normal exit code in the 129–192 range would help prevent regressions and validate the chosen encoding/decoding scheme.
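As a rough illustration of the cases such a test might cover, the sketch below isolates the range check into a hypothetical pure function, `signalFromExitCode`, which is not part of DobbyManager. It mirrors the `exitCode > 128 && exitCode <= 128 + 64` condition quoted above and pins down the boundaries at compile time.

```cpp
// Hypothetical helper: returns the signal encoded in an init exit code,
// or 0 if the code is outside the 129-192 window used by the PR.
constexpr int signalFromExitCode(int exitCode)
{
    return (exitCode > 128 && exitCode <= 128 + 64) ? exitCode - 128 : 0;
}

// Boundary cases for the decoding window.
static_assert(signalFromExitCode(0) == 0, "clean exit is not a signal");
static_assert(signalFromExitCode(128) == 0, "128 is below the window");
static_assert(signalFromExitCode(129) == 1, "first code in the window");
static_assert(signalFromExitCode(192) == 64, "last code in the window");
static_assert(signalFromExitCode(193) == 0, "193 is above the window");

// Case (a): init recorded signal 6 and exited with 128 + 6.
static_assert(signalFromExitCode(134) == 6, "decoded as signal 6");

// Case (b): a process that itself calls exit(137) decodes exactly like a
// container killed by signal 9; the encoding cannot tell the two apart.
static_assert(signalFromExitCode(137) == 9, "ambiguous with a normal exit");
```

Case (b) is the one worth a dedicated test: it documents that any normal exit code in the window is reinterpreted as a signal death, which is an inherent limitation of the 128 + signum encoding rather than something the decoder can resolve on its own.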

ks734 marked this pull request as ready for review April 24, 2026 09:39

2 participants