Merged
26 changes: 10 additions & 16 deletions docs/pse51-matrix.md
@@ -18,16 +18,10 @@ PSE51 interfaces fare on top of that base.
Mazu's PSE51-oriented userspace ABI is feature-complete: every
mandatory PSE51 syscall is wired and exercised by selftests.

Two narrow gaps remain that are not blocking and have reasonable
One narrow gap remains that is not blocking and has reasonable
default behavior today:

1. **`sigqueue` per-signal value payload**. Mazu signals are
level-style: a single bit per signal. The wait-for-signal API
(`SYS_SIGSUSPEND`, `SYS_SIGTIMEDWAIT`) is wired and `sysconf`
reports `_POSIX_REALTIME_SIGNALS = 1` (subset), but
`sigqueue` with a payload value is not implemented. Closing
this gap requires a bounded per-signal queue subsystem.
2. **`pthread_attr_*` libc family**. Strictly a libc-side
1. **`pthread_attr_*` libc family**. Strictly a libc-side
concern. The kernel ABI accepts the resolved (entry, arg,
prio) tuple and exposes per-thread `setschedparam` /
`getschedparam`; once a PSE51 libc lands it can synthesize
@@ -59,7 +53,8 @@ libc attr family are out of scope for the kernel layer.
## What Mazu ships today (PSE51-relevant)

The following PSE51 services are present and exercised by selftests
(`tests/tests-pse51.c`, `tests/tests-syscall.c`,
(`tests/tests-pse51.c` as the consolidated profile suite, with
subsystem regression detail in `tests/tests-syscall.c`,
`tests/tests-mqueue.c`, `tests/tests-posix_timer.c`,
`tests/tests-rwlock.c`, `tests/tests-barrier.c`,
`tests/tests-condvar.c`, `tests/tests-semaphore.c`,
@@ -144,8 +139,8 @@ sync handle table (`kernel/sync/sync_handle.c`).
| `pthread_sigmask` | `SYS_PTHREAD_SIGMASK` | implemented | Same wire shape as `SYS_SIGPROCMASK`; both operate on the calling thread's `td_sig.blocked`. Distinct syscall numbers so libc can keep `pthread_sigmask` and `sigprocmask` as separate ABI surfaces. |
| `pthread_kill` | `SYS_PTHREAD_KILL` | implemented | Thread-directed signal: bit lands on the named thread's `td_sig.pending` rather than the per-proc `proc_pending` mask. Takes a `CAP_TYPE_THREAD` handle. SIGKILL rejected with EINVAL (must be process-wide). |
| `sigsuspend` | `SYS_SIGSUSPEND` | implemented | Replace blocked mask with the supplied set, yield-loop until a deliverable signal arrives, restore prior mask, return EINTR. |
| `sigtimedwait` / `sigwait` / `sigwaitinfo` | `SYS_SIGTIMEDWAIT` | implemented-with-mazu-abi | Block until any signal in the supplied set is pending; dequeue without invoking the handler; return signo. Honors `struct timespec *` timeout (NULL = wait forever; expired = EAGAIN). |
| `sigqueue` value delivery | (none) | stubbed | Mazu signals are level-style: a single bit per signal in `pending`, no per-signal value queue. The wait API set above advertises `_POSIX_REALTIME_SIGNALS = 1` (subset) but `sigqueue` with a payload value requires an additional bounded queue subsystem. |
| `sigtimedwait` / `sigwait` / `sigwaitinfo` | `SYS_SIGTIMEDWAIT` | implemented-with-mazu-abi | Block until any signal in the supplied set is pending; dequeue without invoking the handler; return signo. Honors `struct timespec *` timeout (NULL = wait forever; expired = EAGAIN). Mazu ABI also accepts an optional payload-out pointer in `a3`; queued `sigqueue` values are surfaced there when present. |
| `sigqueue` value delivery | `SYS_SIGQUEUE` | implemented-with-mazu-abi | Process-directed queued values use a bounded per-signal ring (`SIGQUEUE_MAX_PER_SIGNO` entries per signo, with one extra internal slot reserved so a single in-flight `SYS_SIGTIMEDWAIT` consumer can losslessly roll back a dequeued payload if `copy_to_user` faults after the lock was dropped). Lossless rollback is guaranteed for the single-consumer case; if multiple threads simultaneously fault their rollbacks for the same signo, the helper drops one payload as defense-in-depth and surfaces a plain pending instance so the signal stays observable. Plain `kill` remains level-style and is tracked on a separate `proc_pending_plain` mask so it cannot be silently swallowed by a queued instance of the same signo. Queued payloads are observable via `SYS_SIGTIMEDWAIT` and friends, while one-argument signal handlers still receive only signo. |
| `sigprocmask` (single-threaded) | `SYS_SIGPROCMASK` | implemented | Modify-and-return-old of the calling thread's `td_sig.blocked` under `sig_lock`. SIGKILL cannot be blocked. The mask migrated from per-process to per-thread when `SYS_PTHREAD_SIGMASK` landed; both syscalls now share the same backing field with distinct wire shapes so libc can keep them as separate ABI surfaces. |
| `raise` | (libc) | not-applicable | Library-level wrapper for `kill(getpid(), sig)`; covered by `SYS_KILL`. |

@@ -249,7 +244,7 @@ feature-test value when implemented, or `-1` when absent):
| `_SC_THREAD_PRIORITY_INHERIT` | `_POSIX_THREAD_PRIO_INHERIT` (200809L) | PI mutex is the only mutex flavor. |
| `_SC_MESSAGE_PASSING` | `_POSIX_MESSAGE_PASSING` (1) | Anonymous queues only. |
| `_SC_SPIN_LOCKS` | -1 | No userspace `pthread_spin_*` surface today, so the macro is intentionally not defined. |
| `_SC_REALTIME_SIGNALS` | `_POSIX_REALTIME_SIGNALS` (1) | Wait-for-signal API present (sigsuspend / sigtimedwait); per-signal value queue (sigqueue) not implemented. |
| `_SC_REALTIME_SIGNALS` | `_POSIX_REALTIME_SIGNALS` (1) | Wait-for-signal API is present. Bounded `sigqueue` payload delivery exists, but via a Mazu-specific extension rather than the full POSIX `siginfo_t` / `SA_SIGINFO` contract. |
| `_SC_THREADS` | `_POSIX_THREADS` (1) | `SYS_THREAD_*` present; `PROC_THREAD_MAX = 4`. |
| `_SC_THREAD_CPUTIME` | `_POSIX_THREAD_CPUTIME` (200809L) | `clock_gettime(CLOCK_THREAD_CPUTIME_ID, ...)` measures the calling thread's accumulated CPU time. |
| `_SC_CPUTIME` | `_POSIX_CPUTIME` (200809L) | `clock_gettime(CLOCK_PROCESS_CPUTIME_ID, ...)` returns the sum across all live threads in the calling process. |
@@ -277,7 +272,6 @@ The bounded multi-threaded process model is in place: per-thread
state migration (signal pending/blocked, signal-frame chain, robust
futex list, errno TLS) and the user-visible pthread surface
(`SYS_THREAD_CREATE` and friends) have both landed, with
`PROC_THREAD_MAX = 4`. The two remaining gaps are the `sigqueue` payload
queue (requires a bounded per-signal queue subsystem) and the
`pthread_attr_*` libc family (strictly a libc-side concern; the
kernel ABI already accepts the resolved (entry, arg, prio) tuple).
`PROC_THREAD_MAX = 4`. The remaining gap is the `pthread_attr_*`
libc family (strictly a libc-side concern; the kernel ABI already
accepts the resolved (entry, arg, prio) tuple).
35 changes: 31 additions & 4 deletions include/mazu/proc.h
@@ -22,23 +22,50 @@

/* PSE51 signal state. 31 signals (1-31) in a 32-bit bitmask. */
#define SIG_MAX 32
#define SIGQUEUE_MAX_PER_SIGNO 4

/* Internal ring capacity is one greater than the user-visible cap so that a
* sigtimedwait consumer that has already dequeued a payload can always put
* it back if copy_to_user faults after the lock was dropped, even if a
* concurrent sigqueue producer filled the slot we vacated. Producers still
* cap at SIGQUEUE_MAX_PER_SIGNO, so EAGAIN behavior is unchanged for user
* space.
*/
#define SIGQUEUE_RING_CAP (SIGQUEUE_MAX_PER_SIGNO + 1)
typedef void (*sig_handler_fn_t)(i32);
struct sigaction_entry {
sig_handler_fn_t handler;
u32 sa_mask;
u32 sa_flags;
};

struct signal_value_queue {
u64 values[SIGQUEUE_RING_CAP];
u8 head;
u8 tail;
u8 count;
};

/* Per-process signal state. The blocked mask and signal-frame chain live
* per-thread (struct sched_task::td_sig); the disposition table stays
* per-process per POSIX. proc_pending holds process-directed signals that have
* not yet been claimed by any specific thread; the return-to-user delivery
* path folds it into each thread's local pending view, preserving the bit even
* if the thread that the sender first observed has since exited.
* per-process per POSIX.
*
* Process-directed pending state has two distinct sources that must not be
* conflated:
* - proc_pending_plain: kill()-style instances (no queued payload). One bit
* per signo records "at least one plain instance is in flight".
* - queued[signo].count: sigqueue()-style payload instances, FIFO.
* The summary mask proc_pending is the OR of the two and is what the lockless
* return-to-user fast path reads. Writers under sig_lock keep it in sync.
* Consumers (signal_claim_proc_pending_locked, signal_deliver) take exactly
* one source at a time so a plain pending instance cannot be silently dropped
* when a queued instance for the same signo is consumed first.
*/
struct signal_state {
struct sigaction_entry actions[SIG_MAX];
u32 proc_pending;
u32 proc_pending_plain;
struct signal_value_queue queued[SIG_MAX];
};

#define PROC_MAX 16
8 changes: 7 additions & 1 deletion include/mazu/syscall.h
@@ -188,7 +188,13 @@
#define SYS_CAP_REVOKE_DELEGATE 98
#define SYS_CAP_GET_TOKEN 99

#define SYS_NR 100 /* total number of syscalls */
/* PSE51 sigqueue(): queued process-directed signal with payload. Appended at
* the end of the syscall table so the rest of the numbering stays stable across
* this branch.
*/
#define SYS_SIGQUEUE 100

#define SYS_NR 101 /* total number of syscalls */

/* pthread_setcancelstate state values. */
#define PTHREAD_CANCEL_ENABLE 0
7 changes: 4 additions & 3 deletions include/mazu/sysconf.h
@@ -36,9 +36,10 @@
#define _POSIX_THREAD_CPUTIME 200809L
#define _POSIX_THREADS 1 /* SYS_THREAD_*; PROC_THREAD_MAX = 4 */
/* _POSIX_REALTIME_SIGNALS reports the wait-for-signal API set
* (sigsuspend, sigtimedwait, sigwait, sigwaitinfo). Mazu does not
* yet implement the per-signal value queue (sigqueue), so this
* advertises the subset value 1 rather than 200809L.
* (sigsuspend, sigtimedwait, sigwait, sigwaitinfo). Mazu also has a
* bounded sigqueue-style payload path, but it is exposed through a
* Mazu-specific ABI extension rather than the full POSIX siginfo /
* SA_SIGINFO surface, so this remains the subset value 1.
*/
#define _POSIX_REALTIME_SIGNALS 1
/* _POSIX_SPIN_LOCKS is intentionally not defined: there is no
175 changes: 170 additions & 5 deletions kernel/proc/signal.c
@@ -2,7 +2,8 @@
/* Signal delivery for PSE51.
*
* Signal delivery model:
* - Signals are pending bits, not queued (standard POSIX behavior).
* - Plain kill-style signals are pending bits.
* - sigqueue-style process-directed payloads use a bounded per-signo queue.
* - Delivery happens on trap exit (return-to-user path).
* - Handler invocation: save current trap frame on the user stack,
* set up execution to jump to the handler, sys_sigreturn restores.
@@ -25,10 +26,15 @@ void signal_init(struct signal_state *ss)
* task creation in sched_create_user_task; this initializer is
* for the per-process disposition table only.
*/
ss->proc_pending = 0;
ss->proc_pending_plain = 0;
for (i32 i = 0; i < SIG_MAX; i++) {
ss->actions[i].handler = SIG_DFL;
ss->actions[i].sa_mask = 0;
ss->actions[i].sa_flags = 0;
ss->queued[i].head = 0;
ss->queued[i].tail = 0;
ss->queued[i].count = 0;
}
}

@@ -37,6 +43,74 @@ static inline bool sig_valid(i32 signo)
return signo > 0 && signo < SIG_MAX;
}

static bool signal_value_queue_push_locked(struct signal_state *ss,
i32 signo,
u64 value)
{
struct signal_value_queue *q = &ss->queued[signo];

/* Producers cap at the user-visible limit; the +1 internal slot is
* reserved for the rollback-after-fault path below.
*/
if (q->count >= SIGQUEUE_MAX_PER_SIGNO)
return false;
q->values[q->tail] = value;
q->tail = (u8) ((q->tail + 1) % SIGQUEUE_RING_CAP);
q->count++;
return true;
}

static bool signal_value_queue_pop_locked(struct signal_state *ss,
i32 signo,
u64 *out_value)
{
struct signal_value_queue *q = &ss->queued[signo];

if (q->count == 0)
return false;
if (out_value)
*out_value = q->values[q->head];
q->head = (u8) ((q->head + 1) % SIGQUEUE_RING_CAP);
q->count--;
return true;
}

/* Push a payload back at the queue head (LIFO insert used only to undo a
* previous pop). Always succeeds for a single in-flight consumer because the
* ring is sized SIGQUEUE_MAX_PER_SIGNO + 1 (the producer cap leaves one slot
* unused for exactly this case). Returns false only on the pathological case
* where multiple consumers race their rollbacks past the reserved slot.
*/
static bool signal_value_queue_push_head_locked(struct signal_state *ss,
i32 signo,
u64 value)
{
struct signal_value_queue *q = &ss->queued[signo];

if (q->count >= SIGQUEUE_RING_CAP)
return false;
q->head = (u8) ((q->head + SIGQUEUE_RING_CAP - 1) % SIGQUEUE_RING_CAP);
q->values[q->head] = value;
q->count++;
return true;
}
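The reserved-slot argument in the comment above can be checked with a small standalone model. This is only a sketch: the `model_*` names and the use of `<stdint.h>` types in place of the kernel's `u8` / `u64` are assumptions for illustration, and no locking is modelled. It walks the single-consumer case: fill the queue to the producer cap, pop one payload, let a producer refill the vacated slot, then roll the popped payload back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_PER_SIGNO 4                     /* mirrors SIGQUEUE_MAX_PER_SIGNO */
#define MODEL_RING_CAP (MODEL_MAX_PER_SIGNO + 1)  /* mirrors SIGQUEUE_RING_CAP */

struct model_ring {
    uint64_t values[MODEL_RING_CAP];
    uint8_t head;
    uint8_t tail;
    uint8_t count;
};

/* Producer path: refuses once the user-visible cap is reached. */
static bool model_push(struct model_ring *q, uint64_t value)
{
    if (q->count >= MODEL_MAX_PER_SIGNO)
        return false;
    q->values[q->tail] = value;
    q->tail = (uint8_t) ((q->tail + 1) % MODEL_RING_CAP);
    q->count++;
    return true;
}

/* Consumer path: FIFO dequeue from the head. */
static bool model_pop(struct model_ring *q, uint64_t *out_value)
{
    if (q->count == 0)
        return false;
    *out_value = q->values[q->head];
    q->head = (uint8_t) ((q->head + 1) % MODEL_RING_CAP);
    q->count--;
    return true;
}

/* Rollback path: re-insert at the head, allowed up to the full ring size. */
static bool model_push_head(struct model_ring *q, uint64_t value)
{
    if (q->count >= MODEL_RING_CAP)
        return false;
    q->head = (uint8_t) ((q->head + MODEL_RING_CAP - 1) % MODEL_RING_CAP);
    q->values[q->head] = value;
    q->count++;
    return true;
}

/* Runs the scenario; returns true if every step behaved as the comment
 * claims, and stores the payload that is dequeued first after rollback.
 */
static bool model_rollback_scenario(uint64_t *first_after_rollback)
{
    struct model_ring q = {0};
    uint64_t popped = 0;

    for (uint64_t i = 1; i <= MODEL_MAX_PER_SIGNO; i++)
        if (!model_push(&q, i * 10))
            return false;
    if (model_push(&q, 99))           /* producer cap reached: EAGAIN case */
        return false;
    if (!model_pop(&q, &popped))      /* consumer dequeues the oldest payload */
        return false;
    if (!model_push(&q, 50))          /* producer refills the vacated slot */
        return false;
    if (!model_push_head(&q, popped)) /* rollback lands in the reserved slot */
        return false;
    if (model_push_head(&q, 7))       /* a second rollback has no room left */
        return false;
    return model_pop(&q, first_after_rollback);
}
```

In this model the rolled-back payload is the next one dequeued, so FIFO order is preserved for the single consumer, while a second simultaneous rollback is the only step that fails.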

/* Refresh the summary proc_pending bit for signo from the underlying state.
* Caller must hold p->sig_lock. The atomic ensures the lockless fast-path
* reader (signal_has_deliverable) sees a coherent value.
*/
static inline void sig_refresh_proc_pending_locked(struct signal_state *ss,
i32 signo)
{
bool any = (ss->proc_pending_plain & sig_bit(signo)) ||
ss->queued[signo].count > 0;
if (any)
__atomic_or_fetch(&ss->proc_pending, sig_bit(signo), __ATOMIC_RELAXED);
else
__atomic_and_fetch(&ss->proc_pending, ~sig_bit(signo),
__ATOMIC_RELAXED);
}
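A minimal sketch of the invariant this helper maintains: the summary bit is set exactly when either source (plain or queued) is non-empty. The `model_*` names are hypothetical, the queue is reduced to a per-signo count, and `model_sig_bit` assumes a `1u << signo` encoding, which the diff does not show. The `__atomic_*` builtins are the GCC/Clang ones used in the original.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SIG_MAX 32

struct model_sig_state {
    uint32_t proc_pending;               /* summary mask read by the fast path */
    uint32_t proc_pending_plain;         /* kill()-style instances */
    uint8_t queued_count[MODEL_SIG_MAX]; /* stand-in for queued[signo].count */
};

/* Assumed bit encoding; the real sig_bit() is not part of this diff. */
static uint32_t model_sig_bit(int signo)
{
    return (uint32_t) 1u << signo;
}

/* Recompute one summary bit from the two underlying sources. */
static void model_refresh(struct model_sig_state *ss, int signo)
{
    bool any = (ss->proc_pending_plain & model_sig_bit(signo)) ||
               ss->queued_count[signo] > 0;
    if (any)
        __atomic_or_fetch(&ss->proc_pending, model_sig_bit(signo),
                          __ATOMIC_RELAXED);
    else
        __atomic_and_fetch(&ss->proc_pending, ~model_sig_bit(signo),
                           __ATOMIC_RELAXED);
}

/* Start from a deliberately stale summary bit, set the sources as
 * requested, refresh, and return the resulting summary mask.
 */
static uint32_t model_summary_after(bool plain, uint8_t queued, int signo)
{
    struct model_sig_state ss = {0};

    ss.proc_pending = model_sig_bit(signo);
    if (plain)
        ss.proc_pending_plain |= model_sig_bit(signo);
    ss.queued_count[signo] = queued;
    model_refresh(&ss, signo);
    return ss.proc_pending;
}
```

Because the bit is always recomputed rather than toggled, callers can change either source and call the refresh once without reasoning about the previous summary value.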

static inline bool signal_restore_tf_valid(struct proc *p,
const struct trap_frame *tf)
{
@@ -64,7 +138,7 @@ static inline void signal_restore_tf(struct trap_frame *dst,
* a newly posted signal at the earliest opportunity.
*
* TD_STATE_SLEEPING (nanosleep): sched_wake_sleeping cancels the sleep
* callout, removes from sleep_list, and enqueues as READY all under
* callout, removes from sleep_list, and enqueues as READY, all under
* sched_lock, which serializes against the normal sleep callout wake.
*
* TD_STATE_BLOCKED / TD_STATE_SEM_WAIT (sync primitives): we cannot
@@ -167,8 +241,8 @@ i32 signal_send(struct proc *p, i32 signo)
u64 tflags = proc_table_lock_irqsave();
if (p->state != PROC_STATE_FREE && p->state != PROC_STATE_ZOMBIE) {
u64 sflags = proc_sig_lock_irqsave(p);
__atomic_or_fetch(&p->sig_state.proc_pending, sig_bit(signo),
__ATOMIC_RELAXED);
p->sig_state.proc_pending_plain |= sig_bit(signo);
sig_refresh_proc_pending_locked(&p->sig_state, signo);
proc_sig_unlock_irqrestore(p, sflags);
delivered = true;
if (signo == SIGKILL)
@@ -191,6 +265,97 @@ i32 signal_send(struct proc *p, i32 signo)
return 0;
}

i32 signal_queue_send(struct proc *p, i32 signo, u64 value)
{
if (!p || !sig_valid(signo))
return -(i32) EINVAL;

bool need_kill = false;
bool delivered = false;
i32 rc = 0;

u64 tflags = proc_table_lock_irqsave();
if (p->state != PROC_STATE_FREE && p->state != PROC_STATE_ZOMBIE) {
u64 sflags = proc_sig_lock_irqsave(p);
if (!signal_value_queue_push_locked(&p->sig_state, signo, value))
rc = -(i32) EAGAIN;
else {
sig_refresh_proc_pending_locked(&p->sig_state, signo);
delivered = true;
if (signo == SIGKILL)
need_kill = true;
else
signal_interrupt_task(signal_pick_wake_target_locked(p, signo));
}
proc_sig_unlock_irqrestore(p, sflags);
}
proc_table_unlock_irqrestore(tflags);

if (rc < 0)
return rc;
if (!delivered)
return 0;

if (need_kill) {
proc_exit(p, -SIGKILL);
return 0;
}
return 0;
}

bool signal_claim_proc_pending_locked(struct proc *p,
i32 signo,
u64 *out_value,
bool *out_has_value)
{
if (!p || !sig_valid(signo) ||
(p->sig_state.proc_pending & sig_bit(signo)) == 0)
return false;

/* Prefer a queued sigqueue payload (FIFO). If none is queued but a plain
* kill-style instance is still pending, consume that instead. Each call
* consumes exactly one source so kill() and sigqueue() of the same signo
* can coexist without one silently swallowing the other.
*/
bool had_value =
signal_value_queue_pop_locked(&p->sig_state, signo, out_value);
if (!had_value && (p->sig_state.proc_pending_plain & sig_bit(signo))) {
p->sig_state.proc_pending_plain &= ~sig_bit(signo);
}
if (out_has_value)
*out_has_value = had_value;

sig_refresh_proc_pending_locked(&p->sig_state, signo);
return true;
}
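The one-source-per-claim rule can be illustrated with a single-signo model. This is a sketch under simplifying assumptions: the `model_*` names are hypothetical, there is no locking or summary mask, and the ring omits the reserved rollback slot because only claims are exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAP 4 /* mirrors SIGQUEUE_MAX_PER_SIGNO */

struct model_signo {
    bool plain_pending;          /* one bit: "a kill() instance is in flight" */
    uint64_t values[MODEL_CAP];  /* queued sigqueue() payloads, FIFO */
    uint8_t head;
    uint8_t count;
};

/* kill()-style send: repeated sends coalesce into the single plain bit. */
static void model_kill(struct model_signo *s)
{
    s->plain_pending = true;
}

/* sigqueue()-style send: appends a payload, fails when the cap is hit. */
static bool model_sigqueue(struct model_signo *s, uint64_t value)
{
    if (s->count >= MODEL_CAP)
        return false;
    s->values[(s->head + s->count) % MODEL_CAP] = value;
    s->count++;
    return true;
}

/* Claim exactly one instance: a queued payload first, otherwise the
 * plain bit. Returns false when nothing is pending.
 */
static bool model_claim(struct model_signo *s,
                        uint64_t *out_value,
                        bool *out_has_value)
{
    if (!s->plain_pending && s->count == 0)
        return false;

    bool had_value = s->count > 0;
    if (had_value) {
        *out_value = s->values[s->head];
        s->head = (uint8_t) ((s->head + 1) % MODEL_CAP);
        s->count--;
    } else {
        s->plain_pending = false;
    }
    *out_has_value = had_value;
    return true;
}
```

With two `model_kill` calls and one `model_sigqueue` call on the same signo, the model yields two successful claims: the payload first, then one coalesced plain instance. A third claim finds nothing pending.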

bool signal_restore_proc_pending_locked(struct proc *p,
i32 signo,
u64 value,
bool had_value)
{
if (!p || !sig_valid(signo))
return false;

bool payload_dropped = false;
if (had_value) {
if (!signal_value_queue_push_head_locked(&p->sig_state, signo, value)) {
/* The ring is already at SIGQUEUE_RING_CAP: concurrent sigqueue
* producers refilled it and another consumer's rollback took the
* reserved slot. The exact payload is lost, but same-signo instances
* are still queued, so the signal stays observable. Surface a plain
* pending instance as well so the receiver is guaranteed to retry.
*/
p->sig_state.proc_pending_plain |= sig_bit(signo);
payload_dropped = true;
}
} else {
p->sig_state.proc_pending_plain |= sig_bit(signo);
}
sig_refresh_proc_pending_locked(&p->sig_state, signo);
return payload_dropped;
}
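Note that the return value of this helper reports whether a payload was dropped, not whether the restore succeeded. A reduced model of the three outcomes, assuming hypothetical `model_*` names and tracking only the queue depth and the plain bit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_PER_SIGNO 4
#define MODEL_RING_CAP (MODEL_MAX_PER_SIGNO + 1)

struct model_pending {
    bool plain;    /* stand-in for the proc_pending_plain bit */
    uint8_t count; /* stand-in for queued[signo].count */
};

/* Undo a claim. Returns true only when a payload had to be dropped. */
static bool model_restore(struct model_pending *s, bool had_value)
{
    bool payload_dropped = false;

    if (had_value) {
        if (s->count >= MODEL_RING_CAP) {
            /* No room even in the reserved slot: degrade to a plain
             * instance so the signal itself is not lost.
             */
            s->plain = true;
            payload_dropped = true;
        } else {
            s->count++; /* payload goes back at the queue head */
        }
    } else {
        s->plain = true; /* a plain claim is undone by re-setting the bit */
    }
    return payload_dropped;
}
```

A caller that wants to log or count lost payloads can branch on the `true` result. In all three cases the signo stays pending afterwards.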

/* Saved signal context pushed onto the user stack. */
struct signal_frame {
u32 magic;
@@ -260,7 +425,7 @@ bool signal_deliver(struct sched_task *td, struct trap_frame *tf)
if (thread_pending & sig_bit(signo))
td->td_sig.pending &= ~sig_bit(signo);
else
p->sig_state.proc_pending &= ~sig_bit(signo);
(void) signal_claim_proc_pending_locked(p, signo, NULL, NULL);
sig_handler_fn_t handler = p->sig_state.actions[signo].handler;

if (handler == SIG_IGN) {