diff --git a/docs/pse51-matrix.md b/docs/pse51-matrix.md index acf9c01..f44e565 100644 --- a/docs/pse51-matrix.md +++ b/docs/pse51-matrix.md @@ -16,16 +16,9 @@ PSE51 interfaces fare on top of that base. ## Conformance status Mazu's PSE51-oriented userspace ABI is feature-complete: every -mandatory PSE51 syscall is wired and exercised by selftests. - -One narrow gap remains that is not blocking and has reasonable -default behavior today: - -1. **`pthread_attr_*` libc family**. Strictly a libc-side - concern. The kernel ABI accepts the resolved (entry, arg, - prio) tuple and exposes per-thread `setschedparam` / - `getschedparam`; once a PSE51 libc lands it can synthesize - attr objects on top of the existing kernel surface. +mandatory PSE51 syscall is wired and exercised by selftests, and +the `pthread_attr_*` family ships as a header-only library at +`include/mazu/pthread.h` on top of the kernel surface. Deliberate deviations (not fix-it gaps): @@ -42,13 +35,14 @@ mlock / munlock range form, fsync / fdatasync, sched_setscheduler / _getscheduler, SIGEV_THREAD_ID timer delivery, CLOCK_THREAD_CPUTIME_ID, CLOCK_PROCESS_CPUTIME_ID, absolute-timespec timed waits, thread-exit trampoline (with -magic verification). +magic verification), pthread_attr_* family (header-only +library at `include/mazu/pthread.h` plus explicit-priority +spawn via `SYS_THREAD_CREATE_EXPLICIT`). Public docs should describe the implementation as "bounded PSE51-compatible userspace core". The subset framing is honest because Mazu intentionally exceeds PSE51 with multi-process and -filesystem support, and because the realtime-signals queue and -libc attr family are out of scope for the kernel layer. +filesystem support. ## What Mazu ships today (PSE51-relevant) @@ -120,12 +114,12 @@ sync handle table (`kernel/sync/sync_handle.c`). 
| Interface (POSIX) | Mazu syscall | Status | Notes | |---|---|---|---| | `pthread_self` | `SYS_THREAD_SELF` | implemented | Returns the caller's `CAP_TYPE_THREAD` small-int handle. | -| `pthread_create` | `SYS_THREAD_CREATE` | implemented | PROC_THREAD_MAX = 4. Slot reservation under `proc_table_lock`, per-thread stack VA inside the proc slot. Returns a fresh `CAP_TYPE_THREAD` handle. Priority inherits from creator; an explicit priority arg ABI is a future extension. | +| `pthread_create` | `SYS_THREAD_CREATE` / `SYS_THREAD_CREATE_EXPLICIT` | implemented | PROC_THREAD_MAX = 4. Slot reservation under `proc_table_lock`, per-thread stack VA inside the proc slot. Returns a fresh `CAP_TYPE_THREAD` handle. `SYS_THREAD_CREATE` keeps the original two-argument ABI and always inherits the creator's base priority. `SYS_THREAD_CREATE_EXPLICIT` is the opt-in extension: a2 == 0 inherits, values in [1, CONFIG_SCHED_NPRIO] set explicit priority (a2 - 1), a value above CONFIG_SCHED_NPRIO returns EINVAL, and a value that would exceed the creator's base priority returns EPERM. | | `pthread_join` | `SYS_THREAD_JOIN` | implemented | Blocks on `target->td_join_wq`; atomically claims `EXITED -> REAPED` via cmpxchg before reaping. EDEADLK on self-join, ESRCH on unknown thread handle, EINVAL on detached/already-reaped, EINTR on cancellation. | | `pthread_detach` | `SYS_THREAD_DETACH` | implemented | Tries `JOINABLE -> DETACHED` first; if the target already exited, claims `EXITED -> REAPED` and reaps inline. Either claim wakes pending joiners. | | `pthread_exit` | `SYS_THREAD_EXIT` | implemented | Last-thread exit collapses into `proc_exit`; non-last exit unwinds the thread's robust futex list. A user thread that returns from its entry function lands on the per-process unmapped trampoline at `signal_trampoline_pc(p)+4`; the trap handler synthesizes `SYS_THREAD_EXIT(0)`, so an implicit return is equivalent to an explicit pthread_exit. 
| | `pthread_setschedparam` / `_getschedparam` | `SYS_THREAD_SETSCHEDPARAM` / `_GETSCHEDPARAM` | implemented-with-mazu-abi | Take a `CAP_TYPE_THREAD` handle (0 = self) and a scalar priority. Privilege bound: cannot raise above caller's own base priority. | -| `pthread_attr_*` | (libc) | stubbed | Attribute objects (`setstack`, `setdetachstate`, `setschedpolicy`, `setschedparam`, `setinheritsched`) are user-space libc concerns, but a "PSE51 complete" claim requires them to exist somewhere in the toolchain image. Mazu does not ship a libc with these wrappers today. The kernel ABI accepts the resolved (entry, arg, stack, prio) tuple; once a libc lands, this row flips to `not-applicable`. | +| `pthread_attr_*` | (libc) | implemented | Header-only library at `include/mazu/pthread.h`. Covers `pthread_attr_init` / `_destroy` / `_setdetachstate` / `_getdetachstate` / `_setinheritsched` / `_getinheritsched` / `_setschedpolicy` / `_getschedpolicy` / `_setschedparam` / `_getschedparam` / `_setstacksize` / `_getstacksize` / `_setstack` / `_getstack`. All functions return positive errno on failure (POSIX convention). Stack address selection is delegated to the kernel (shared-VA model); `pthread_attr_setstack` always returns `ENOTSUP`, and `pthread_attr_setstacksize` succeeds only for `USER_STACK_SIZE`, so callers do not observe a stack contract the kernel cannot honor. The accompanying `pthread_attr_resolve_create_syscall` and `pthread_attr_resolve_prio_arg` helpers produce the syscall number and a2 encoding for the future `pthread_create` wrapper. | | `pthread_spin_init` / `_lock` / `_trylock` / `_unlock` / `_destroy` | (none) | stubbed | Mazu has kernel-internal spinlocks, but no userspace-visible busy-wait primitive. The `_POSIX_SPIN_LOCKS` macro is therefore intentionally *not* defined and `_SC_SPIN_LOCKS` returns -1 — advertising it would let an app gate on the macro and call absent APIs. 
Expect a libc-side implementation backed by a futex once threads land, not a kernel syscall. | | `pthread_cancel` / `pthread_setcancelstate` / `pthread_testcancel` | `SYS_THREAD_CANCEL` / `SYS_THREAD_SETCANCELSTATE` / `SYS_THREAD_TESTCANCEL` | implemented | Deferred cancellation: pthread_cancel sets `td_cancel_pending`; the target observes the bit at the next cancellation point and exits with code -ECANCELED. ASYNC type is treated as DEFERRED because Mazu has no in-kernel cancellation points other than blocking syscalls. | @@ -270,8 +264,10 @@ work. They are part of the product, not optional compatibility extensions. The bounded multi-threaded process model is in place: per-thread state migration (signal pending/blocked, signal-frame chain, robust -futex list, errno TLS) and the user-visible pthread surface -(`SYS_THREAD_CREATE` and friends) have both landed, with -`PROC_THREAD_MAX = 4`. The remaining gap is the `pthread_attr_*` -libc family (strictly a libc-side concern; the kernel ABI already -accepts the resolved (entry, arg, prio) tuple). +futex list, errno TLS), the user-visible pthread surface +(`SYS_THREAD_CREATE` and friends, `PROC_THREAD_MAX = 4`), and the +`pthread_attr_*` library have all landed. A future user-mode +`pthread_create` wrapper can honor `PTHREAD_EXPLICIT_SCHED` without +the `SETSCHEDPARAM`-after-create race window by switching from +`SYS_THREAD_CREATE` to `SYS_THREAD_CREATE_EXPLICIT` and passing the +resolved priority in a2 as (prio + 1). 
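Note on the a2 encoding described in the docs hunk above: the following is a minimal standalone sketch of the validation order that the `SYS_THREAD_CREATE_EXPLICIT` handler in this patch applies (inherit on 0, range check, then privilege check). It is not the kernel code. `DEMO_SCHED_NPRIO` and `demo_resolve_prio` are hypothetical names introduced for illustration, and the value 8 is an assumption rather than the real `CONFIG_SCHED_NPRIO`.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed for illustration only; the real CONFIG_SCHED_NPRIO comes from
 * the kernel configuration and may differ.
 */
#define DEMO_SCHED_NPRIO 8

/* Hypothetical helper mirroring the a2 decoding order in the patch:
 *   a2 == 0                  -> inherit the creator's base priority
 *   a2 > DEMO_SCHED_NPRIO    -> -EINVAL (out of range)
 *   (a2 - 1) > creator base  -> -EPERM  (would raise above the creator)
 *   otherwise                -> explicit priority a2 - 1
 * Returns 0 and stores the resolved priority, or a negative errno.
 */
static int demo_resolve_prio(uint64_t a2, uint8_t creator_base_prio,
                             uint8_t *out_prio)
{
    if (a2 == 0) {
        *out_prio = creator_base_prio;
        return 0;
    }
    if (a2 > (uint64_t) DEMO_SCHED_NPRIO)
        return -EINVAL;
    uint8_t explicit_prio = (uint8_t) (a2 - 1);
    if (explicit_prio > creator_base_prio)
        return -EPERM;
    *out_prio = explicit_prio;
    return 0;
}
```

The range check runs before the narrowing cast to `uint8_t`, so a large a2 cannot alias a valid priority after truncation. The output priority is left untouched on either error path.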
diff --git a/include/mazu/errordef.h b/include/mazu/errordef.h index 66071e7..0b71df1 100644 --- a/include/mazu/errordef.h +++ b/include/mazu/errordef.h @@ -113,6 +113,7 @@ static inline struct result result_ok(void) #define ECONNREFUSED 111 #define EHOSTUNREACH 113 #define ECANCELED 125 /* operation canceled (POSIX, pthread_cancel) */ +#define ENOTSUP 95 /* operation not supported (POSIX) */ static inline struct str error_code_str(u16 code) { @@ -199,6 +200,8 @@ static inline struct str error_code_str(u16 code) return STR("No route to host (EHOSTUNREACH)"); case ECANCELED: return STR("Operation canceled (ECANCELED)"); + case ENOTSUP: + return STR("Operation not supported (ENOTSUP)"); default: return STR("Unknown error"); } diff --git a/include/mazu/pthread.h b/include/mazu/pthread.h new file mode 100644 index 0000000..62f7211 --- /dev/null +++ b/include/mazu/pthread.h @@ -0,0 +1,272 @@ +/* SPDX-License-Identifier: MIT */ +/* PSE51 pthread_attr_* surface. + * + * Header-only library that synthesizes POSIX pthread attribute objects on + * top of the Mazu kernel ABI. The kernel exposes the resolved thread tuple + * directly (entry, arg, priority) via SYS_THREAD_CREATE and + * SYS_THREAD_CREATE_EXPLICIT; this header gives a userspace caller the + * POSIX-shaped attribute API to build that tuple. + * + * All functions return a positive errno on failure (POSIX convention) and + * 0 on success. Storage lives entirely inside the caller-provided + * pthread_attr_t; nothing here performs a syscall and nothing here + * allocates. + * + * Mazu specifics: + * - Mutex policy is fixed to priority-inheritance with direct handover, + * so pthread_attr_setschedpolicy accepts SCHED_OTHER, SCHED_FIFO, and + * SCHED_RR (the three POSIX policies Mazu treats as effective FIFO) + * and rejects other values with EINVAL. + * - Stack placement is governed by the shared address-space model: each + * thread slot owns a fixed per-process VA band. 
The kernel therefore + * exposes exactly one usable stack size (USER_STACK_SIZE) and cannot + * honor a caller-supplied stack region. pthread_attr_setstacksize + * succeeds only for USER_STACK_SIZE; pthread_attr_setstack always + * returns ENOTSUP. + * - The stack-size accessors use Mazu's signed sz (ptrdiff_t) for + * consistency with the rest of the codebase, where POSIX would specify + * size_t. Negative values are rejected by the PTHREAD_STACK_MIN bound + * check; a future libc shim that exposes the size_t form converts at + * the boundary. + */ + +#ifndef MAZU_PTHREAD_H +#define MAZU_PTHREAD_H + +#include +#include +#include +#include +#include + +/* Detach state. POSIX defaults to JOINABLE. */ +#define PTHREAD_CREATE_JOINABLE 0 +#define PTHREAD_CREATE_DETACHED 1 + +/* Inherit-sched. POSIX defaults to INHERIT_SCHED. */ +#define PTHREAD_INHERIT_SCHED 0 +#define PTHREAD_EXPLICIT_SCHED 1 + +/* Minimum stacksize a portable caller may request. POSIX leaves the + * exact value implementation-defined; we pick one page so the value is + * obviously below the kernel's USER_STACK_SIZE (16 KiB). + */ +#ifndef PTHREAD_STACK_MIN +#define PTHREAD_STACK_MIN 4096 +#endif + +/* POSIX scheduling parameter object. PSE51 carries only sched_priority. */ +struct sched_param { + i32 sched_priority; +}; + +/* pthread_attr_t is opaque per POSIX; the fields below are an + * implementation detail. Callers must use the accessor functions. + */ +typedef struct { + u8 detachstate; /* PTHREAD_CREATE_JOINABLE | PTHREAD_CREATE_DETACHED */ + u8 inheritsched; /* PTHREAD_INHERIT_SCHED | PTHREAD_EXPLICIT_SCHED */ + u8 sched_policy; /* SCHED_OTHER | SCHED_FIFO | SCHED_RR */ + i32 sched_priority; /* Range checked against the kernel scheduler. */ + void *stackaddr; /* Reserved for getstack roundtrip; always NULL today. */ + sz stacksize; /* Mirrors the kernel's fixed USER_STACK_SIZE. 
*/ +} pthread_attr_t; + +static inline i32 pthread_attr_init(pthread_attr_t *attr) +{ + if (!attr) + return EINVAL; + attr->detachstate = PTHREAD_CREATE_JOINABLE; + attr->inheritsched = PTHREAD_INHERIT_SCHED; + attr->sched_policy = SCHED_FIFO; + attr->sched_priority = SCHED_PRIO_NORMAL; + attr->stackaddr = NULL; + attr->stacksize = USER_STACK_SIZE; + return 0; +} + +static inline i32 pthread_attr_destroy(pthread_attr_t *attr) +{ + if (!attr) + return EINVAL; + /* No owned resources. Stamp every field so use-after-destroy is + * easier to spot under a debugger or a sanitizer. + */ + attr->detachstate = 0xFF; + attr->inheritsched = 0xFF; + attr->sched_policy = 0xFF; + attr->sched_priority = -1; + attr->stackaddr = (void *) (uptr) -1; + attr->stacksize = -1; + return 0; +} + +static inline i32 pthread_attr_setdetachstate(pthread_attr_t *attr, i32 state) +{ + if (!attr) + return EINVAL; + if (state != PTHREAD_CREATE_JOINABLE && state != PTHREAD_CREATE_DETACHED) + return EINVAL; + attr->detachstate = (u8) state; + return 0; +} + +static inline i32 pthread_attr_getdetachstate(const pthread_attr_t *attr, + i32 *state) +{ + if (!attr || !state) + return EINVAL; + *state = (i32) attr->detachstate; + return 0; +} + +static inline i32 pthread_attr_setinheritsched(pthread_attr_t *attr, + i32 inheritsched) +{ + if (!attr) + return EINVAL; + if (inheritsched != PTHREAD_INHERIT_SCHED && + inheritsched != PTHREAD_EXPLICIT_SCHED) + return EINVAL; + attr->inheritsched = (u8) inheritsched; + return 0; +} + +static inline i32 pthread_attr_getinheritsched(const pthread_attr_t *attr, + i32 *inheritsched) +{ + if (!attr || !inheritsched) + return EINVAL; + *inheritsched = (i32) attr->inheritsched; + return 0; +} + +static inline i32 pthread_attr_setschedpolicy(pthread_attr_t *attr, i32 policy) +{ + if (!attr) + return EINVAL; + /* Mazu maps SCHED_OTHER and SCHED_RR onto SCHED_FIFO at the kernel + * boundary; the attr roundtrip preserves the caller's stated choice + * so a portable 
program sees what it set. + */ + if (policy != SCHED_OTHER && policy != SCHED_FIFO && policy != SCHED_RR) + return EINVAL; + attr->sched_policy = (u8) policy; + return 0; +} + +static inline i32 pthread_attr_getschedpolicy(const pthread_attr_t *attr, + i32 *policy) +{ + if (!attr || !policy) + return EINVAL; + *policy = (i32) attr->sched_policy; + return 0; +} + +static inline i32 pthread_attr_setschedparam(pthread_attr_t *attr, + const struct sched_param *param) +{ + if (!attr || !param) + return EINVAL; + if (param->sched_priority < SCHED_PRIO_IDLE || + param->sched_priority >= CONFIG_SCHED_NPRIO) + return EINVAL; + attr->sched_priority = param->sched_priority; + return 0; +} + +static inline i32 pthread_attr_getschedparam(const pthread_attr_t *attr, + struct sched_param *param) +{ + if (!attr || !param) + return EINVAL; + param->sched_priority = attr->sched_priority; + return 0; +} + +static inline i32 pthread_attr_setstacksize(pthread_attr_t *attr, sz stacksize) +{ + if (!attr) + return EINVAL; + if (stacksize < PTHREAD_STACK_MIN) + return EINVAL; + if (stacksize != USER_STACK_SIZE) + return ENOTSUP; + attr->stacksize = stacksize; + return 0; +} + +static inline i32 pthread_attr_getstacksize(const pthread_attr_t *attr, + sz *stacksize) +{ + if (!attr || !stacksize) + return EINVAL; + *stacksize = attr->stacksize; + return 0; +} + +static inline i32 pthread_attr_setstack(pthread_attr_t *attr, + void *stackaddr, + sz stacksize) +{ + if (!attr) + return EINVAL; + (void) stackaddr; + (void) stacksize; + /* POSIX semantics require a caller-supplied stack region at + * stackaddr; Mazu's shared-VA model assigns every thread slot a + * fixed kernel-chosen stack VA, so the call cannot be honored. + * Returning ENOTSUP keeps the attr in its previous state and + * lets a portable caller detect the constraint at runtime. 
+ */ + return ENOTSUP; +} + +static inline i32 pthread_attr_getstack(const pthread_attr_t *attr, + void **stackaddr, + sz *stacksize) +{ + if (!attr || !stackaddr || !stacksize) + return EINVAL; + *stackaddr = attr->stackaddr; + *stacksize = attr->stacksize; + return 0; +} + +/* Resolve attr into the syscall number a future pthread_create wrapper + * should use: the historical two-argument SYS_THREAD_CREATE when the + * caller inherits scheduling, or SYS_THREAD_CREATE_EXPLICIT when the + * wrapper must pass an explicit a2 priority encoding. + */ +static inline u64 pthread_attr_resolve_create_syscall( + const pthread_attr_t *attr) +{ + if (!attr || attr->inheritsched != PTHREAD_EXPLICIT_SCHED) + return SYS_THREAD_CREATE; + return SYS_THREAD_CREATE_EXPLICIT; +} + +/* Resolve attr into the explicit-priority value for + * SYS_THREAD_CREATE_EXPLICIT's a2 register: 0 means inherit, otherwise + * (prio + 1). The setters validate sched_priority against the kernel + * range on input, so a well-formed attr always produces a value in + * [1, CONFIG_SCHED_NPRIO]. If a caller manipulates the struct directly + * and parks an out-of-range value, return a sentinel above + * CONFIG_SCHED_NPRIO so the kernel's own bound check rejects with + * EINVAL. Silently demoting to inherit would erase the user's stated + * intent; equally importantly, the (u64) cast of a negative i32 wraps + * to UINT64_MAX, and the naive "+ 1" would then wrap back to 0 and + * mimic the inherit encoding, so the bound check below has to run + * before the cast. 
+ */ +static inline u64 pthread_attr_resolve_prio_arg(const pthread_attr_t *attr) +{ + if (!attr || attr->inheritsched != PTHREAD_EXPLICIT_SCHED) + return 0; + if (attr->sched_priority < 0 || attr->sched_priority >= CONFIG_SCHED_NPRIO) + return (u64) CONFIG_SCHED_NPRIO + 1; + return (u64) attr->sched_priority + 1; +} + +#endif /* MAZU_PTHREAD_H */ diff --git a/include/mazu/syscall.h b/include/mazu/syscall.h index 6457bfb..f3a04d5 100644 --- a/include/mazu/syscall.h +++ b/include/mazu/syscall.h @@ -193,8 +193,9 @@ * this branch. */ #define SYS_SIGQUEUE 100 +#define SYS_THREAD_CREATE_EXPLICIT 101 -#define SYS_NR 101 /* total number of syscalls */ +#define SYS_NR 102 /* total number of syscalls */ /* pthread_setcancelstate state values. */ #define PTHREAD_CANCEL_ENABLE 0 diff --git a/kernel/proc/syscall.c b/kernel/proc/syscall.c index 355c1e6..4ba5647 100644 --- a/kernel/proc/syscall.c +++ b/kernel/proc/syscall.c @@ -2668,14 +2668,43 @@ static i64 sys_sigprocmask_h(struct trap_frame *tf, struct sched_task *td) * (sig pending/blocked, signal-frame chain, robust futex, * exit_code, join waitqueue) lives on struct sched_task. */ -static i64 sys_thread_create_h(struct trap_frame *tf, struct sched_task *td) +static i64 sys_thread_create_common(struct trap_frame *tf, + struct sched_task *td, + bool use_explicit_prio_abi) { if (!td || !td->proc) return -(i64) EPERM; ptr u_entry = (ptr) tf->a0; ptr u_arg = (ptr) tf->a1; - /* Inherit creator's base priority unless caller specifies one. */ - u8 prio = td->td_base_prio; + u8 creator_base_prio = __atomic_load_n(&td->td_base_prio, __ATOMIC_RELAXED); + u8 prio = creator_base_prio; + if (use_explicit_prio_abi) { + /* Priority encoding in a2: + * 0 -> inherit creator's base priority. + * 1..CONFIG_SCHED_NPRIO -> explicit (prio = a2 - 1). + * anything else -> EINVAL. + * + * This lives on a dedicated syscall number so the historical + * SYS_THREAD_CREATE ABI remains a strict two-argument interface. 
+ * Pre-existing callers are not required to clear a2 before + * ecall. The privilege bound matches pthread_setschedparam: + * a thread may not spawn a child above its own base priority. + * + * Snapshot td_base_prio once: a cross-hart setschedparam can + * mutate it between the inherit-default read and the EPERM + * comparison, and using the same snapshot for both keeps the + * decision internally consistent. + */ + u64 a2 = tf->a2; + if (a2 != 0) { + if (a2 > (u64) CONFIG_SCHED_NPRIO) + return -(i64) EINVAL; + u8 explicit_prio = (u8) (a2 - 1); + if (explicit_prio > creator_base_prio) + return -(i64) EPERM; + prio = explicit_prio; + } + } /* Validate the entry point is in an executable VMA. The arg is * an opaque pointer the user passes through; do not validate it. @@ -2693,6 +2722,17 @@ static i64 sys_thread_create_h(struct trap_frame *tf, struct sched_task *td) return cap_get_token(td->proc, new_td->td_cap_slot, CAP_TYPE_THREAD); } +static i64 sys_thread_create_h(struct trap_frame *tf, struct sched_task *td) +{ + return sys_thread_create_common(tf, td, false); +} + +static i64 sys_thread_create_explicit_h(struct trap_frame *tf, + struct sched_task *td) +{ + return sys_thread_create_common(tf, td, true); +} + static bool thread_target_is_live(const struct sched_task *target) { if (!target) @@ -3343,6 +3383,8 @@ static const struct syscall_entry syscall_table[SYS_NR] = { /* Thread management (item 15d) */ [SYS_THREAD_CREATE] = {sys_thread_create_h, SYSCALL_F_NEEDS_PROC}, + [SYS_THREAD_CREATE_EXPLICIT] = {sys_thread_create_explicit_h, + SYSCALL_F_NEEDS_PROC}, [SYS_THREAD_JOIN] = {sys_thread_join_h, SYSCALL_F_NEEDS_PROC}, [SYS_THREAD_DETACH] = {sys_thread_detach_h, SYSCALL_F_NEEDS_PROC}, [SYS_THREAD_EXIT] = {sys_thread_exit_h, SYSCALL_F_NEEDS_PROC}, diff --git a/tests/tests-pse51.c b/tests/tests-pse51.c index 56a5e30..9f8409e 100644 --- a/tests/tests-pse51.c +++ b/tests/tests-pse51.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include 
@@ -385,6 +386,228 @@ static i32 test_pse51_sched_thread_profile(void) } DEFINE_SELFTEST(pse51_sched_thread_profile, test_pse51_sched_thread_profile); +static i32 test_pse51_pthread_attr_profile(void) +{ + pthread_attr_t attr; + i32 state = 0; + i32 inheritsched = 0; + i32 policy = 0; + sz size = 0; + void *addr = NULL; + struct sched_param param; + + /* NULL guards on every entry point. */ + SELFTEST_ASSERT(pthread_attr_init(NULL) == EINVAL, 1); + SELFTEST_ASSERT(pthread_attr_destroy(NULL) == EINVAL, 2); + SELFTEST_ASSERT(pthread_attr_setdetachstate(NULL, 0) == EINVAL, 3); + SELFTEST_ASSERT(pthread_attr_getdetachstate(NULL, &state) == EINVAL, 4); + SELFTEST_ASSERT(pthread_attr_setinheritsched(NULL, 0) == EINVAL, 5); + SELFTEST_ASSERT(pthread_attr_getinheritsched(NULL, &inheritsched) == EINVAL, + 6); + SELFTEST_ASSERT(pthread_attr_setschedpolicy(NULL, SCHED_FIFO) == EINVAL, 7); + SELFTEST_ASSERT(pthread_attr_getschedpolicy(NULL, &policy) == EINVAL, 8); + SELFTEST_ASSERT(pthread_attr_setschedparam(NULL, &param) == EINVAL, 9); + SELFTEST_ASSERT(pthread_attr_getschedparam(NULL, &param) == EINVAL, 10); + SELFTEST_ASSERT( + pthread_attr_setstacksize(NULL, PTHREAD_STACK_MIN) == EINVAL, 11); + SELFTEST_ASSERT(pthread_attr_getstacksize(NULL, &size) == EINVAL, 12); + SELFTEST_ASSERT( + pthread_attr_setstack(NULL, NULL, PTHREAD_STACK_MIN) == EINVAL, 13); + SELFTEST_ASSERT(pthread_attr_getstack(NULL, &addr, &size) == EINVAL, 14); + + SELFTEST_ASSERT(pthread_attr_init(&attr) == 0, 20); + + /* Defaults match POSIX: joinable, inherit-sched, FIFO, normal prio. 
*/ + SELFTEST_ASSERT(pthread_attr_getdetachstate(&attr, &state) == 0, 21); + SELFTEST_ASSERT(state == PTHREAD_CREATE_JOINABLE, 22); + SELFTEST_ASSERT(pthread_attr_getinheritsched(&attr, &inheritsched) == 0, + 23); + SELFTEST_ASSERT(inheritsched == PTHREAD_INHERIT_SCHED, 24); + SELFTEST_ASSERT(pthread_attr_getschedpolicy(&attr, &policy) == 0, 25); + SELFTEST_ASSERT(policy == SCHED_FIFO, 26); + SELFTEST_ASSERT(pthread_attr_getschedparam(&attr, &param) == 0, 27); + SELFTEST_ASSERT(param.sched_priority == SCHED_PRIO_NORMAL, 28); + SELFTEST_ASSERT(pthread_attr_getstacksize(&attr, &size) == 0, 29); + SELFTEST_ASSERT(size == USER_STACK_SIZE, 30); + + /* detachstate: valid round-trip, invalid value rejected. */ + SELFTEST_ASSERT( + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) == 0, 31); + SELFTEST_ASSERT(pthread_attr_getdetachstate(&attr, &state) == 0, 32); + SELFTEST_ASSERT(state == PTHREAD_CREATE_DETACHED, 33); + SELFTEST_ASSERT(pthread_attr_setdetachstate(&attr, 99) == EINVAL, 34); + SELFTEST_ASSERT(pthread_attr_getdetachstate(&attr, &state) == 0, 35); + SELFTEST_ASSERT(state == PTHREAD_CREATE_DETACHED, 36); + + /* inheritsched: valid round-trip, invalid value rejected. */ + SELFTEST_ASSERT( + pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) == 0, 40); + SELFTEST_ASSERT(pthread_attr_getinheritsched(&attr, &inheritsched) == 0, + 41); + SELFTEST_ASSERT(inheritsched == PTHREAD_EXPLICIT_SCHED, 42); + SELFTEST_ASSERT(pthread_attr_setinheritsched(&attr, 99) == EINVAL, 43); + + /* schedpolicy: all three POSIX policies round-trip; bad value rejected. 
*/ + SELFTEST_ASSERT(pthread_attr_setschedpolicy(&attr, SCHED_OTHER) == 0, 50); + SELFTEST_ASSERT(pthread_attr_getschedpolicy(&attr, &policy) == 0, 51); + SELFTEST_ASSERT(policy == SCHED_OTHER, 52); + SELFTEST_ASSERT(pthread_attr_setschedpolicy(&attr, SCHED_RR) == 0, 53); + SELFTEST_ASSERT(pthread_attr_getschedpolicy(&attr, &policy) == 0, 54); + SELFTEST_ASSERT(policy == SCHED_RR, 55); + SELFTEST_ASSERT(pthread_attr_setschedpolicy(&attr, 0xFF) == EINVAL, 56); + + /* schedparam: bounds checked against the kernel range. */ + param.sched_priority = SCHED_PRIO_IDLE; + SELFTEST_ASSERT(pthread_attr_setschedparam(&attr, &param) == 0, 60); + SELFTEST_ASSERT(pthread_attr_getschedparam(&attr, &param) == 0, 61); + SELFTEST_ASSERT(param.sched_priority == SCHED_PRIO_IDLE, 62); + param.sched_priority = CONFIG_SCHED_NPRIO - 1; + SELFTEST_ASSERT(pthread_attr_setschedparam(&attr, &param) == 0, 63); + param.sched_priority = CONFIG_SCHED_NPRIO; + SELFTEST_ASSERT(pthread_attr_setschedparam(&attr, &param) == EINVAL, 64); + param.sched_priority = -1; + SELFTEST_ASSERT(pthread_attr_setschedparam(&attr, &param) == EINVAL, 65); + + /* stacksize: below-min rejected; only the kernel's fixed per-thread + * size is accepted. + */ + SELFTEST_ASSERT( + pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN - 1) == EINVAL, 70); + SELFTEST_ASSERT( + pthread_attr_setstacksize(&attr, PTHREAD_STACK_MIN) == ENOTSUP, 71); + SELFTEST_ASSERT(pthread_attr_getstacksize(&attr, &size) == 0, 72); + SELFTEST_ASSERT(size == USER_STACK_SIZE, 73); + SELFTEST_ASSERT(pthread_attr_setstacksize(&attr, USER_STACK_SIZE) == 0, 74); + SELFTEST_ASSERT(pthread_attr_getstacksize(&attr, &size) == 0, 75); + SELFTEST_ASSERT(size == USER_STACK_SIZE, 76); + + /* setstack always returns ENOTSUP. POSIX semantics need a real + * caller-supplied stack region; Mazu's shared-VA model places every + * thread's stack at a fixed kernel-chosen VA, so the call cannot + * be honored. getstack still round-trips whatever pthread_attr_init + * stored. 
+ */ + SELFTEST_ASSERT( + pthread_attr_setstack(&attr, NULL, PTHREAD_STACK_MIN) == ENOTSUP, 80); + SELFTEST_ASSERT(pthread_attr_setstack(&attr, (void *) 0x10000, + PTHREAD_STACK_MIN) == ENOTSUP, + 81); + SELFTEST_ASSERT(pthread_attr_getstack(&attr, &addr, &size) == 0, 82); + SELFTEST_ASSERT(addr == NULL, 83); + + /* Resolve helpers: inherit-sched uses the historical two-argument + * SYS_THREAD_CREATE ABI. EXPLICIT_SCHED selects the dedicated + * SYS_THREAD_CREATE_EXPLICIT entry point and encodes (prio + 1) in + * a2. If a caller has bypassed the setters and parked an out-of-range + * priority, the helper encodes a sentinel above CONFIG_SCHED_NPRIO so + * the kernel returns EINVAL, instead of silently demoting the request + * to inherit. + */ + SELFTEST_ASSERT( + pthread_attr_setinheritsched(&attr, PTHREAD_INHERIT_SCHED) == 0, 90); + param.sched_priority = SCHED_PRIO_HIGH; + SELFTEST_ASSERT(pthread_attr_setschedparam(&attr, &param) == 0, 91); + SELFTEST_ASSERT( + pthread_attr_resolve_create_syscall(&attr) == SYS_THREAD_CREATE, 92); + SELFTEST_ASSERT(pthread_attr_resolve_prio_arg(&attr) == 0, 93); + SELFTEST_ASSERT( + pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) == 0, 94); + SELFTEST_ASSERT(pthread_attr_resolve_create_syscall(&attr) == + SYS_THREAD_CREATE_EXPLICIT, + 95); + SELFTEST_ASSERT( + pthread_attr_resolve_prio_arg(&attr) == (u64) (SCHED_PRIO_HIGH + 1), + 96); + SELFTEST_ASSERT( + pthread_attr_resolve_create_syscall(NULL) == SYS_THREAD_CREATE, 97); + SELFTEST_ASSERT(pthread_attr_resolve_prio_arg(NULL) == 0, 98); + /* Direct-write out-of-range priority encodes to a sentinel above + * CONFIG_SCHED_NPRIO, which the kernel's a2 bound check rejects. + * The negative case is the load-bearing one: a naive (u64) cast + * of i32 -1 plus 1 wraps to 0 and would mimic the inherit + * encoding, so the bound check has to run before the cast. 
+ */ + attr.sched_priority = CONFIG_SCHED_NPRIO + 5; + SELFTEST_ASSERT( + pthread_attr_resolve_prio_arg(&attr) > (u64) CONFIG_SCHED_NPRIO, 99); + attr.sched_priority = -1; + SELFTEST_ASSERT( + pthread_attr_resolve_prio_arg(&attr) > (u64) CONFIG_SCHED_NPRIO, 100); + + SELFTEST_ASSERT(pthread_attr_destroy(&attr) == 0, 101); + return 0; +} +DEFINE_SELFTEST(pse51_pthread_attr_profile, test_pse51_pthread_attr_profile); + +/* SYS_THREAD_CREATE stays a strict two-argument ABI: it ignores a2 so + * pre-existing callers that do not clear unused registers keep working. + * SYS_THREAD_CREATE_EXPLICIT carries the opt-in explicit-priority + * encoding in a2 so a libc pthread_create with PTHREAD_EXPLICIT_SCHED + * can spawn at a non-default priority without the + * SYS_THREAD_SETSCHEDPARAM race window. Encoding: a2 == 0 -> inherit, + * a2 in [1, CONFIG_SCHED_NPRIO] -> explicit prio (a2 - 1). + * Out-of-range -> EINVAL; raising above the creator's base priority + * -> EPERM. + * + * The test passes u_entry = 0, which fails proc_vma_check_access with + * EFAULT after the priority arm of the handler runs. That lets us + * exercise the validation path without standing up an executable VMA + * and a runnable stack. + */ +static i32 test_pse51_thread_create_prio_abi(void) +{ + struct proc *p; + struct sched_task *td; + struct trap_frame tf = {0}; + + SELFTEST_ASSERT(alloc_proc_and_task(&p, &td), 1); + + /* Establish a known creator base priority so the privilege bound + * has a non-trivial threshold to assert against. + */ + td->td_base_prio = SCHED_PRIO_HIGH; + + tf.a7 = SYS_THREAD_CREATE; + tf.a0 = 0; + tf.a1 = 0; + + /* Historical ABI: a2 is ignored, even if it contains garbage. */ + tf.a2 = (u64) CONFIG_SCHED_NPRIO + 1; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EFAULT, 2); + + tf.a7 = SYS_THREAD_CREATE_EXPLICIT; + + /* a2 == 0: inherit; proc_vma_check_access fails first with EFAULT. 
*/ + tf.a2 = 0; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EFAULT, 3); + + /* a2 == 1: explicit prio 0 (IDLE), passes prio check, falls through + * to EFAULT for the same reason. + */ + tf.a2 = 1; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EFAULT, 4); + + /* a2 = creator's base + 1: explicit prio == base, allowed. */ + tf.a2 = (u64) td->td_base_prio + 1; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EFAULT, 5); + + /* a2 = base + 2: explicit prio == base + 1, would raise above the + * creator. EPERM gates this before the entry check. + */ + if ((u64) td->td_base_prio + 2 <= (u64) CONFIG_SCHED_NPRIO) { + tf.a2 = (u64) td->td_base_prio + 2; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EPERM, 6); + } + + /* a2 > CONFIG_SCHED_NPRIO: out of range. */ + tf.a2 = (u64) CONFIG_SCHED_NPRIO + 1; + SELFTEST_ASSERT(syscall_dispatch(&tf, td) == -(i64) EINVAL, 7); + + free_proc_and_task(p, td); + return 0; +} +DEFINE_SELFTEST(pse51_thread_create_prio_abi, + test_pse51_thread_create_prio_abi); + static i32 test_pse51_signal_profile(void) { struct proc *p; diff --git a/tests/tests-syscall.c b/tests/tests-syscall.c index 29fbf4c..8134fa0 100644 --- a/tests/tests-syscall.c +++ b/tests/tests-syscall.c @@ -1859,7 +1859,9 @@ static i32 selftest_sigqueue_abi_numbering(void) "SYS_CAP_REVOKE_DELEGATE must stay at 98"); static_assert(SYS_CAP_GET_TOKEN == 99, "SYS_CAP_GET_TOKEN must stay at 99"); static_assert(SYS_SIGQUEUE == 100, "SYS_SIGQUEUE must be appended at 100"); - static_assert(SYS_NR == 101, "SYS_NR must stay at 101"); + static_assert(SYS_THREAD_CREATE_EXPLICIT == 101, + "SYS_THREAD_CREATE_EXPLICIT must be appended at 101"); + static_assert(SYS_NR == 102, "SYS_NR must stay at 102"); return 0; } DEFINE_SELFTEST(sigqueue_abi_numbering, selftest_sigqueue_abi_numbering);
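Note on the negative-priority case that the attr selftest calls load-bearing: the standalone sketch below contrasts a naive `(prio + 1)` encoding with a bound-checked one, under assumed names. `demo_naive_prio_arg`, `demo_checked_prio_arg`, and the `nprio` parameter are hypothetical and stand in for the header's `pthread_attr_resolve_prio_arg` and `CONFIG_SCHED_NPRIO`; the sketch only illustrates the integer-conversion hazard, not the header itself.

```c
#include <stdint.h>

/* Naive encoding: converting a negative int32_t to uint64_t yields a
 * value congruent modulo 2^64, so -1 becomes UINT64_MAX and the "+ 1"
 * wraps to 0, which is indistinguishable from the "inherit" encoding.
 */
static uint64_t demo_naive_prio_arg(int32_t prio)
{
    return (uint64_t) prio + 1;
}

/* Checked encoding: validate in the signed domain first, and map any
 * out-of-range priority to a sentinel one above the accepted a2 range
 * so a later range check can reject it instead of treating it as
 * inherit. nprio is the assumed number of priority levels.
 */
static uint64_t demo_checked_prio_arg(int32_t prio, int32_t nprio)
{
    if (prio < 0 || prio >= nprio)
        return (uint64_t) nprio + 1;
    return (uint64_t) prio + 1;
}
```

With an assumed `nprio` of 8, the checked form maps both -1 and 13 to 9, which falls outside the accepted a2 range of 0 through 8, while in-range priorities keep the plain `prio + 1` encoding.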