feat(sidecar): build pipeline for embedded rapid-mlx artifact #568

Merged
raullenchai merged 8 commits into main from feat/sidecar-build-pipeline
Jun 13, 2026

@raullenchai raullenchai commented Jun 13, 2026

Owner

Summary

Codifies the Phase 2 spike recipe into a reproducible artifact factory. Output: a signed `tar.gz` uploaded to GitHub Releases as the asset rapid-desktop's release workflow pulls into `Rapid.app/Contents/Resources/rapid-mlx/`.

Spike report (validated 2026-06-13 on M3 Ultra): 184 MB compressed DMG, 77 Mach-Os to sign, Metal JIT works, mlx-lm dynamic loading works. Tool chosen: `astral-sh/python-build-standalone` 3.12.13 tag 20260610.

What ships

| File | Purpose |
| --- | --- |
| `scripts/build-sidecar.sh` | Driver script (~200 LOC). Steps 1-9 from spike report. Skippable codesign + smoke flags for unsigned PR runs. |
| `scripts/sidecar-shim.sh` | `/bin/sh` entrypoint installed at `bin/rapid-mlx`. Pins PYTHONHOME / PYTHONPATH / PYTHONNOUSERSITE. |
| `scripts/sidecar-entitlements.plist` | The three required entitlements (`allow-jit`, `disable-library-validation`, `allow-unsigned-executable-memory`). All empirically required: dropping any breaks dlopen or Metal JIT. |
| `.github/workflows/sidecar-build.yml` | macos-15 arm64 build on `sidecar-v*` tag push or workflow_dispatch. Secret-gated signing: with secrets configured, signs and creates a GitHub Release; without, runs unsigned for PR test runs. |

Mach-O baseline

Baseline = 77 (spike measurement). Tolerance = ±3 (small wheel drift OK). Bigger drift fails the build with exit code 2 — forces a re-validation spike before lifting the baseline.
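
As an illustration of the tolerance rule above, here is a minimal sketch of a drift check. The function name `check_macho_drift` is hypothetical and the real `scripts/build-sidecar.sh` may be structured differently. Only the baseline, the tolerance, the exit code 2, and the two env var names come from this PR.

```shell
# Hypothetical helper, not the actual script.
# Returns 0 when count is within baseline +/- tolerance, 2 otherwise.
check_macho_drift() {
  local count="$1" baseline="$2" tolerance="$3"
  local delta=$(( count - baseline ))
  # Take the absolute value of the difference.
  if [ "$delta" -lt 0 ]; then
    delta=$(( -delta ))
  fi
  if [ "$delta" -gt "$tolerance" ]; then
    echo "Mach-O count drift: got $count, expected $baseline +/- $tolerance" >&2
    return 2
  fi
  return 0
}

# Defaults mirror the numbers above; the variable names are the script's documented knobs.
MACHO_BASELINE_COUNT="${MACHO_BASELINE_COUNT:-77}"
MACHO_TOLERANCE="${MACHO_TOLERANCE:-3}"
```

A caller would presumably run something like `check_macho_drift "$count" "$MACHO_BASELINE_COUNT" "$MACHO_TOLERANCE" || exit 2` once the Mach-Os have been enumerated.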

Required secrets (rapid-mlx repo)

  • `APPLE_DEVELOPER_ID_APP` — "Developer ID Application: ()"
  • `APPLE_SIGNING_CERTIFICATE` — base64 of the .p12
  • `APPLE_SIGNING_PASSWORD` — .p12 password

Without these the workflow still runs but skips codesigning + release upload. PRs from forks complete cleanly.

Test plan

  • Phase 2 spike validated bundle locally at /tmp/rapid-mlx-bundle-spike/ (184 MB DMG, 77 Mach-Os, mlx JIT smoke pass)
  • Build script smoke-tests itself via `--skip-codesign` path (no external secrets required)

Post-merge follow-ups (require secrets configuration)

These can't run inside this PR because the codesigning secrets aren't yet configured on raullenchai/Rapid-MLX. Tracked here for the sidecar rollout sweep:

  • Workflow dispatch test on raullenchai/Rapid-MLX once `APPLE_DEVELOPER_ID_APP` / `APPLE_SIGNING_CERTIFICATE` / `APPLE_SIGNING_PASSWORD` are set
  • First signed sidecar release tagged `sidecar-v0.7.3` (re-run the workflow with that tag pushed)
  • Downstream: rapid-desktop release workflow (Phase 5) pulls the release asset and stages into `Contents/Resources/rapid-mlx/`

Out of scope (next phases)

  • rapid-desktop Phase 5: `release.yml` pulls the sidecar tarball before signing
  • rapid-desktop Phase 3: Settings → Inference Engine pane surfaces which sidecar version is active
  • rapid-desktop Phase 4: in-app updater pulls newer sidecar tags into `runtime-override/`

Codifies the Phase 2 spike recipe (rapid-desktop
docs/plans/sidecar-bundling-phase-2-spike.md) into a reproducible
artifact factory. Output: a signed tar.gz uploaded to GitHub Releases
of raullenchai/Rapid-MLX as the asset the desktop release workflow
pulls into Rapid.app/Contents/Resources/rapid-mlx/.

## scripts/build-sidecar.sh

Driver script (~200 LOC). Steps:
1. Verify arm64 runner (mlx is Apple Silicon only).
2. Download python-build-standalone 3.12.13 (pinned tag 20260610)
   to /tmp, extract into $STAGE/python/.
3. pip install rapid-mlx + runtime deps into $STAGE/site-packages
   (driven by host python because bundled has ensurepip stripped).
4. Strip dev/unused artifacts (ensurepip, idlelib, tkinter, test,
   mlx/include, mlx/lib/cmake, __pycache__).
5. Install the shim entrypoint at $STAGE/bin/rapid-mlx.
6. Enumerate Mach-Os. Baseline is 77; tolerance is ±3 (small wheel
   drift OK, bigger means new dependency — block and require spike
   re-validation).
7. Codesign every Mach-O with the entitlements (--options runtime
   --timestamp). Skippable via --skip-codesign for unsigned PR runs.
8. Package as tar.gz + write a SHA-256 checksum file next to it.
9. Smoke test: env-stripped `rapid-mlx --version` and bundled python
   import mlx + zero matmul (proves Metal JIT path).

Knobs (env vars):
  OUT_DIR, DEVELOPER_ID, PBS_TAG, PBS_VERSION,
  MACHO_BASELINE_COUNT, MACHO_TOLERANCE

Exit codes: 0 success; 1 generic; 2 Mach-O count drift;
3 smoke failure (signing fine, runtime broken).

## scripts/sidecar-shim.sh

Tiny /bin/sh entrypoint installed as $STAGE/bin/rapid-mlx. Pins
PYTHONHOME/PYTHONPATH/PYTHONNOUSERSITE so a host
`pip install --user mlx==<other>` can't leak a different mlx.so.
Resolves one level of symlink (covers the design-doc plan of a
runtime-override symlink at user scope). BSD-readlink-friendly —
no readlink -f.
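
The symlink handling described above could look roughly like the sketch below. It is an assumption-laden illustration, not the contents of `scripts/sidecar-shim.sh`: the helper names `resolve_one_level` and `bundle_root` are hypothetical, and the `python/` and `site-packages/` layout is inferred from the build steps.

```shell
#!/bin/sh
# Illustrative sketch only; the real shim may differ.

# Resolve at most one level of symlink with plain `readlink` (no -f),
# which behaves the same on BSD/macOS and GNU userlands.
resolve_one_level() {
  target="$1"
  if [ -L "$target" ]; then
    link="$(readlink "$target")"
    case "$link" in
      /*) target="$link" ;;                        # absolute link target
      *)  target="$(dirname "$target")/$link" ;;   # relative to the link's directory
    esac
  fi
  printf '%s\n' "$target"
}

# Given the resolved path of bin/rapid-mlx, print the bundle root (one level above bin/).
bundle_root() {
  (cd "$(dirname "$1")/.." && pwd)
}

# A shim built on these helpers might end like this (assumed layout and values):
#   self="$(resolve_one_level "$0")"
#   root="$(bundle_root "$self")"
#   export PYTHONHOME="$root/python"
#   export PYTHONPATH="$root/site-packages"
#   export PYTHONNOUSERSITE=1
#   exec "$root/python/bin/python3.12" ... "$@"
```

Resolving only one level keeps the logic small while still covering the single user-scope `runtime-override` symlink mentioned in the design doc.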

## scripts/sidecar-entitlements.plist

Three entitlements, all empirically required by Phase 2 spike:
- com.apple.security.cs.allow-jit
- com.apple.security.cs.disable-library-validation
- com.apple.security.cs.allow-unsigned-executable-memory

Same shape as rapid-desktop Resources/Rapid.entitlements (Phase 1).
Library-validation entitlement is mandatory — without it dlopen of
every wheel .so fails with "different Team IDs".

## .github/workflows/sidecar-build.yml

Triggered by sidecar-v* tag push or workflow_dispatch. macos-15
arm64 runner, 30-min timeout. Secret-gated codesigning: if all of
APPLE_DEVELOPER_ID_APP / APPLE_SIGNING_CERTIFICATE /
APPLE_SIGNING_PASSWORD are configured, codesigns and uploads to a
GitHub Release with release notes carrying the SHA-256. Otherwise
runs --skip-codesign for unsigned PR/test runs (artifact still
uploaded as a workflow artifact, just not Released).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions Bot commented Jun 13, 2026

PR #568 validation scorecard

Title: feat(sidecar): build pipeline for embedded rapid-mlx artifact
Author: raullenchai
Diff: 4 file(s), +662/-0 LOC, blast radius: medium

Verdict: MERGE-SAFE

| step | status | summary | time |
| --- | --- | --- | --- |
| fetch | PASS | 4 files, +662/-0 LOC, blast=medium | 2.7s |
| test_plan_check | PASS | all 2 test-plan item(s) checked | 0.0s |
| cl_description_quality | PASS | title OK + body has rationale (2930 chars) | 0.0s |
| codex_review | skip | codex CLI not found on PATH (install: npm i -g @openai/codex) | 0.0s |
| supply_chain | PASS | 8 warning(s) — human review wanted | 0.0s |
| lint | skip | skipped (gating predicate returned False) | 0.0s |
| targeted_tests | skip | skipped (gating predicate returned False) | 0.0s |

Details

supply_chain — PASS

Findings:

  • [warning] modifies install/CI hook(s): ['.github/workflows/sidecar-build.yml']. These run unattended; review every line.
  • .github/workflows/sidecar-build.yml near l96: workflow accesses repository secret — verify intent

    if [ -n "${{ secrets.APPLE_DEVELOPER_ID_APP }}" ] \

  • .github/workflows/sidecar-build.yml near l97: workflow accesses repository secret — verify intent

    && [ -n "${{ secrets.APPLE_SIGNING_CERTIFICATE }}" ] \

  • .github/workflows/sidecar-build.yml near l98: workflow accesses repository secret — verify intent

    && [ -n "${{ secrets.APPLE_SIGNING_PASSWORD }}" ]; then

  • .github/workflows/sidecar-build.yml near l109: workflow accesses repository secret — verify intent

    CERT_B64: ${{ secrets.APPLE_SIGNING_CERTIFICATE }}

  • .github/workflows/sidecar-build.yml near l110: workflow accesses repository secret — verify intent

    CERT_PASSWORD: ${{ secrets.APPLE_SIGNING_PASSWORD }}

  • .github/workflows/sidecar-build.yml near l134: workflow accesses repository secret — verify intent

    DEVELOPER_ID: ${{ secrets.APPLE_DEVELOPER_ID_APP }}

  • scripts/build-sidecar.sh near l310: eval() — usually wrong; never on untrusted data

    'import mlx.core as mx; mx.eval(mx.zeros((4,4))); print("ok")' 2>&1)"; then

Artifacts:

  • /tmp/pr_validate/pr-568/supply-chain-scan.log

@raullenchai raullenchai reopened this Jun 13, 2026
…in cleanup

Round 1 codex review on PR #568 returned 4 BLOCKING findings; one (B3,
entitlement mismatch) was already resolved by merged rapid-desktop
PR #38 which added the third entitlement to Rapid.entitlements. This
commit addresses the remaining three:

* **B1 — smoke runs BEFORE packaging.** Old order produced the tarball
  then smoke-tested the staged bundle. A smoke failure halted set -e
  before Release upload, but only after burning CI minutes producing
  an artifact we'd immediately throw away. Reorder so smoke fails
  fast.
* **B2 — Mach-O floor guard before drift check.** A partial pip
  install could leave 30-50 Mach-Os instead of 77 and we'd report
  "drift" pointing the operator at re-baselining when the real fix is
  re-reading the pip log. Add a hard floor at half the baseline below
  which we exit with a clear "check pip logs, do NOT bump baseline"
  message.
* **B4 — keychain cleanup with `if: always()`.** GitHub-hosted runners
  are wiped between jobs, but the cleanup adds defense-in-depth and
  covers future self-hosted workflows. The cleanup step deletes the
  keychain and removes any leftover cert.p12 even when codesign or
  smoke failed earlier.

Also folded in two codex r1 NITs while I was here:
* `trap` now catches INT/TERM (Ctrl-C in interactive runs no longer
  leaks the tmpfile).
* Workflow now triggers on `pull_request` paths-filter for the
  sidecar scripts/workflow so we catch breakage pre-tag.

Verified `bash -n scripts/build-sidecar.sh` clean.
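
For the `trap` NIT above, one common way to get cleanup on normal exit, Ctrl-C, and TERM is sketched here. The wrapper name `with_tmpfile` is hypothetical, and the actual script may simply list all three signals on a single `trap` line.

```shell
# Hypothetical helper: run a command with a scratch file that is removed on
# normal exit, INT (Ctrl-C), or TERM. The subshell keeps the traps scoped.
with_tmpfile() {
  (
    tmpfile="$(mktemp)" || exit 1
    trap 'rm -f "$tmpfile"' EXIT
    trap 'exit 130' INT    # exiting from the signal trap triggers the EXIT trap
    trap 'exit 143' TERM
    "$@" "$tmpfile"
  )
}
```

The command passed to `with_tmpfile` receives the scratch path as its last argument, and the file is gone by the time the wrapper returns.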
First CI run on GitHub-hosted macos-15 (run 27472544784) produced 51
Mach-Os, not the 77 measured in the Phase 2 spike on my M3 Ultra. The
local 77 almost certainly included build-time artifacts the strip step
removes on a fresh runner — 51 is the authoritative "what actually
ships" number.

Also widen tolerance from 3 to 5 (about ±10% of the new baseline of 51,
versus about ±4% of 77 before, so somewhat looser in relative terms),
and dump the sorted Mach-O list to stderr when drift fires so the
operator can diff against the previous run without having to
re-execute the build locally.

The new floor guard (codex r1 B2) correctly let 51 through (above
floor 38 = 77/2) and the drift check pinpointed the discrepancy —
proves both layers of the count-protection logic work as designed.
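
The two layers of count protection can be sketched as below. This is an assumption-based illustration: the function name and the printed labels are hypothetical, and only the half-baseline floor, the tolerance check, and the numbers in the usage lines come from the commit messages.

```shell
# Hypothetical classifier: floor guard first, then drift check.
# $1 = measured Mach-O count, $2 = baseline, $3 = tolerance.
classify_macho_count() {
  count="$1"; baseline="$2"; tolerance="$3"
  floor=$(( baseline / 2 ))   # integer division, e.g. 77 / 2 = 38
  if [ "$count" -lt "$floor" ]; then
    # Likely a partial pip install: check the pip logs, do not re-baseline.
    echo "below-floor"
    return
  fi
  delta=$(( count - baseline ))
  if [ "$delta" -lt 0 ]; then
    delta=$(( -delta ))
  fi
  if [ "$delta" -gt "$tolerance" ]; then
    echo "drift"
  else
    echo "ok"
  fi
}

classify_macho_count 51 77 3   # prints "drift" (the first CI run)
classify_macho_count 51 51 5   # prints "ok" (after re-baselining)
classify_macho_count 30 77 3   # prints "below-floor"
```

Ordering the floor guard first means a badly truncated install never reaches the drift message that suggests re-baselining.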
…olation

Round 2 codex review caught two new BLOCKING items, both stemming from
the `pull_request` trigger I added in r1 fixes:

* **B5 — Upload artifact gated on signed builds only.** The PR-trigger
  build runs against the PR's branch code, which on a fork could be
  malicious. Uploading a downloadable `rapid-mlx-sidecar.tar.gz` named
  exactly like the real release asset would be a supply-chain
  confusion hazard even when unsigned — the bundle's entitlements
  plist ships JIT + library-validation-disabled keys the host Rapid.app
  honors at dlopen time. PR runs now validate the build path without
  publishing an artifact.
* **B6 — Release step adds `pull_request` veto.** Today codesign
  secrets are never injected on fork PR runs so the codesign gate
  alone blocks Release. But a future maintainer adding environment-
  protected secrets could silently enable Release on PR runs. Explicit
  `pull_request != true` belt-and-suspenders makes the invariant
  non-bypassable by accident.
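
The B5/B6 invariants can be stated as two small predicates. This is only a sketch: the function names are hypothetical, and the real workflow most likely expresses the same conditions in its YAML rather than in shell. The first argument stands for the triggering event name (for example the value of `GITHUB_EVENT_NAME`), the second for whether the build was codesigned.

```shell
# Hypothetical gate helpers.
# $1 = event name, $2 = "true" if the build was codesigned.

# B5: only signed builds publish a downloadable artifact.
should_upload_artifact() {
  [ "$2" = "true" ]
}

# B6: a Release needs a signed build AND a non-pull_request event.
should_release() {
  [ "$1" != "pull_request" ] && [ "$2" = "true" ]
}
```

Keeping the event check separate from the signing check is what makes the veto hold even if secrets later become available on PR runs.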

Folded in two r2 NITs while here:
* **N1 — smoke runs with `HOME=$(mktemp -d)`** so the mlx Metal JIT
  cache doesn't pollute the developer's real `~/Library/Caches/mlx`
  during interactive runs. Trap extended to clean it up.
* **N4 — `MACHO_FLOOR` clamps to `BASELINE-2` for very small baselines**
  so a test override of e.g. 5 doesn't produce a useless floor of 2.

Verified `bash -n` clean.
GitHub-hosted macos-15 runners are virtualized and may not expose a
working Metal device. The old smoke wrapped import + eval in one
suppressed call so when `mx.eval` failed we couldn't tell whether the
bundle was actually broken or just the runner environment.

Split into two stages:
* **Hard** — `import mlx.core` must succeed. If this fails the bundle
  is genuinely broken; exit 3.
* **Soft on CI / hard locally** — `mx.eval(mx.zeros((4,4)))`. On CI
  ($CI is set) a failure logs the actual error and continues; the
  real Metal exercise happens in rapid-desktop's post-notary smoke
  (Phase 5) on a notarized Mac. Locally, a failure still aborts so a
  developer running `scripts/build-sidecar.sh` on their M3 catches
  regressions.

Also surfaces the actual error output instead of redirecting to
/dev/null so future failures point at the real cause.
Previous split-smoke commit invoked `$STAGE/python/bin/python3.12`
directly without setting the env vars the shim normally sets. Because
the install uses `pip install --target site-packages/`, the bundled
python's default interpreter path doesn't include site-packages —
`import mlx` fails with ModuleNotFoundError on a perfectly healthy
bundle.

The `rapid-mlx --version` part of the smoke worked because it routes
through `bin/rapid-mlx` (the shim) which sets PYTHONHOME +
PYTHONPATH + PYTHONNOUSERSITE before exec'ing python.

Add the same three env vars to both the import-only and Metal-eval
smoke commands. This makes the smoke test what the *bundle* does at
runtime, not what a raw python invocation does.
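
A minimal sketch of running the bundled interpreter with the same pinned environment follows. The helper name `bundled_python` is hypothetical, and the exact values the shim assigns are an assumption based on the `$STAGE/python` and `$STAGE/site-packages` layout from the build steps.

```shell
# Hypothetical helper: invoke the bundled python with a stripped environment
# plus the three variables the shim is described as pinning.
# $1 = stage directory, remaining args are passed to python.
bundled_python() {
  stage="$1"; shift
  env -i \
    PYTHONHOME="$stage/python" \
    PYTHONPATH="$stage/site-packages" \
    PYTHONNOUSERSITE=1 \
    "$stage/python/bin/python3.12" "$@"
}
```

With a helper like this, the import-only smoke would be along the lines of `bundled_python "$STAGE" -c 'import mlx.core'`.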

The fact this slipped through is a smoke-bug not a bundling-bug —
PR #568 still bundles mlx correctly; we just weren't verifying it
correctly.
The CI invocation passes `--out build/sidecar-stage` (a path relative
to the workflow's working directory). The script then runs
`( cd "$OUT_DIR" && tar -czf "$TARBALL" rapid-mlx )`. Inside that
subshell, $TARBALL still holds the relative path
`build/sidecar-stage/rapid-mlx-sidecar.tar.gz`, so tar tries to create
the archive at `./build/sidecar-stage/...` from INSIDE
`build/sidecar-stage/` and exits with "Failed to open".

Smoke now passes on CI (mlx import + Metal JIT both OK on macos-15
runners), so this was the last gate. Resolve OUT_DIR + STAGE to
absolute paths via `cd && pwd` right after mkdir so every downstream
reference is portable across cwd changes.
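
The `cd && pwd` fix can be sketched as follows. The helper names `abs_dir` and `package_bundle` are hypothetical. The tarball name and the `rapid-mlx` directory come from the commit message above.

```shell
# Hypothetical helper: create the directory if needed and print its absolute path.
abs_dir() {
  mkdir -p "$1" && (cd "$1" && pwd)
}

# Package <out_dir>/rapid-mlx into <out_dir>/rapid-mlx-sidecar.tar.gz.
package_bundle() {
  out_dir="$(abs_dir "$1")" || return 1
  tarball="$out_dir/rapid-mlx-sidecar.tar.gz"
  # The subshell changes directory, but $tarball is absolute,
  # so tar writes the archive to the intended location.
  ( cd "$out_dir" && tar -czf "$tarball" rapid-mlx ) || return 1
  printf '%s\n' "$tarball"
}
```

Calling `package_bundle build/sidecar-stage` from any working directory then yields an absolute tarball path.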
Round 3 codex review caught that the supposed "soft-fail Metal on CI"
logic added in commit 0daf9fc never actually executed: under
`set -euo pipefail`, the pattern

    METAL_OUT="$(... Metal eval ...)"
    METAL_RC=$?

aborts the script the moment the command substitution returns
non-zero — `set -e` triggers BEFORE the `$?` capture runs, so the
elif soft-skip branch was unreachable.

CI is currently green only because Metal eval *succeeds* on the
macos-15 runners. As soon as a future mlx wheel breaks Metal under
GHA's virtualised macOS, the script would hard-fail instead of
warning + continuing — the exact opposite of what the commit
message promised.

Fix: restructure as `if METAL_OUT="$(...)"; then ...; else
METAL_RC=$?; fi`. A command tested as an `if` condition is exempt from
`set -e`, so the script survives both the success and failure paths
and the soft-skip elif runs as documented.
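
The corrected pattern is sketched below with a stand-in for the Metal command. `metal_eval` and `run_metal_smoke` are hypothetical names, and the sketch uses `set -eu` because only the `-e` part matters for this behaviour. Return code 3 follows the script's documented smoke-failure exit code.

```shell
set -eu

# Stand-in for the real Metal eval; it always fails here to exercise the failure path.
metal_eval() {
  echo "no Metal device"
  return 1
}

run_metal_smoke() {
  local out rc=0
  # As an `if` condition, the failing substitution does not trip `set -e`.
  if out="$(metal_eval 2>&1)"; then
    echo "metal: ok"
  else
    rc=$?   # exit status of the failed command substitution
    if [ -n "${CI:-}" ]; then
      echo "metal: soft-fail on CI (rc=$rc): $out"
    else
      echo "metal: hard-fail (rc=$rc): $out" >&2
      return 3
    fi
  fi
}
```

Written as a bare `out="$(metal_eval)"` followed by `rc=$?`, the same script would exit at the assignment and never reach the CI branch.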

Also folded codex r3 N5 (defensive OUT_DIR validation): after
`OUT_DIR="$(cd "$OUT_DIR" && pwd)"`, if absolutisation ever silently
produced an empty path (impossible for realistic failure modes, but
the blast radius would be `rm -rf "/rapid-mlx"` on line 124 —
catastrophic-rm class bug), we now bail out with an explicit error.
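
A guard of this kind could look like the sketch below. The function name `safe_stage_path` is hypothetical, and rejecting relative paths as well as empty ones is an extra assumption beyond what the commit describes.

```shell
# Hypothetical guard: print the stage path only if OUT_DIR is a non-empty
# absolute path, otherwise fail before anything destructive can run.
safe_stage_path() {
  out_dir="$1"
  case "$out_dir" in
    /*)
      printf '%s\n' "$out_dir/rapid-mlx"
      ;;
    *)
      echo "refusing: OUT_DIR must be a non-empty absolute path, got '$out_dir'" >&2
      return 1
      ;;
  esac
}
```

A destructive step would then be written as `stage="$(safe_stage_path "$OUT_DIR")" && rm -rf "$stage"`, so an empty `OUT_DIR` stops the script before `rm` is ever reached.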

Verified `bash -n` clean.
@raullenchai raullenchai merged commit 2d7e721 into main Jun 13, 2026
15 checks passed
@raullenchai raullenchai deleted the feat/sidecar-build-pipeline branch June 13, 2026 17:00