diff --git a/.claude/commands/README.md b/.claude/commands/README.md new file mode 100644 index 00000000000..3767dac987a --- /dev/null +++ b/.claude/commands/README.md @@ -0,0 +1,49 @@ +# Claude Code slash commands for the mcp-server test suite + +Three AI-assisted workflows wrapping `mcp-server/run-tests.sh` and the meshtastic MCP tools. Each one has a twin in `.github/prompts/` for Copilot users. + +| Slash command | What it does | Copilot equivalent | +| --------------------- | ------------------------------------------------------------------------- | ---------------------------------------- | +| `/test [args]` | Runs the test suite (auto-detects hardware) and interprets failures | `.github/prompts/mcp-test.prompt.md` | +| `/diagnose [role]` | Read-only device health report via the meshtastic MCP tools | `.github/prompts/mcp-diagnose.prompt.md` | +| `/repro [n=5]` | Re-runs one test N times, diffs firmware logs between passes and failures | `.github/prompts/mcp-repro.prompt.md` | + +## Why two surfaces + +The Claude Code commands and Copilot prompts cover the same three workflows but each speaks its host's idiom: + +- **Claude Code** (`/test`) uses `$ARGUMENTS` for pass-through, has direct access to Bash + all MCP tools registered in the user's settings, and runs in the terminal context. +- **Copilot** (`/mcp-test`) runs in VS Code's agent mode; it has terminal + MCP access too but typically asks the operator to confirm inputs interactively. + +A contributor using either IDE gets equivalent assistance. Keep the two in sync when behavior changes — the diff of intent should be minimal. + +## House rules + +- **No destructive writes without explicit operator approval.** Skills that could reflash, factory-reset, or reboot a device must describe the action and stop — the operator authorizes. 
+- **Interpret failures, don't just echo them.** The skill body should pull firmware log lines from `mcp-server/tests/report.html` (the `Meshtastic debug` section, attached by `tests/conftest.py::pytest_runtest_makereport`) and classify the failure. +- **Keep MCP tool calls sequential per port.** SerialInterface holds an exclusive port lock; two parallel tool calls on the same port deadlock. +- **Never speculate about root cause.** If the evidence doesn't support a classification, say "unknown" and list what you'd need to disambiguate. + +## Adding a new command + +1. Write the Claude Code version at `.claude/commands/.md` with YAML frontmatter: + + ```yaml + --- + description: one-line purpose (used for auto-invocation by the model) + argument-hint: [optional-hint] + --- + ``` + +2. Write the Copilot equivalent at `.github/prompts/mcp-.prompt.md` with: + + ```yaml + --- + mode: agent + description: ... + --- + ``` + +3. Add the row to the table above. Cross-link in both bodies. + +4. Smoke-test on Claude Code first (`/` should appear in autocomplete), then in VS Code Copilot (`/mcp-` in Chat). diff --git a/.claude/commands/diagnose.md b/.claude/commands/diagnose.md new file mode 100644 index 00000000000..d664f631294 --- /dev/null +++ b/.claude/commands/diagnose.md @@ -0,0 +1,68 @@ +--- +description: Produce a device health report using the meshtastic MCP tools (device_info, list_nodes, get_config, short serial log capture) +argument-hint: [role=all|nrf52|esp32s3|] +--- + +# `/diagnose` — device health report + +Call the meshtastic MCP tool bundle and format a structured health report for one or all detected devices. Zero guesswork for the operator. + +## What to do + +1. **Enumerate hardware.** Call `mcp__meshtastic__list_devices(include_unknown=True)`. For each entry where `likely_meshtastic=True`, capture `port`, `vid`, `pid`, `description`. + +2. **Filter by `$ARGUMENTS`**: + - No args, `all` → every likely-meshtastic device. 
+ - `nrf52` → only devices with `vid == 0x239a`. + - `esp32s3` → only devices with `vid == 0x303a` or `vid == 0x10c4`. + - A `/dev/cu.*` path → only that one port. + - Anything else → treat as a substring match against the `port` string. + +3. **For each selected device, in sequence (NOT parallel — SerialInterface holds an exclusive port lock):** + - `mcp__meshtastic__device_info(port=

)` — captures `my_node_num`, `long_name`, `short_name`, `firmware_version`, `hw_model`, `region`, `num_nodes`, `primary_channel`. + - `mcp__meshtastic__list_nodes(port=

)` — count of peers, which ones have `publicKey` set, SNR/RSSI distribution. + - `mcp__meshtastic__get_config(section="lora", port=

)` — region, preset, channel_num, tx_power, hop_limit. + - Optionally, if the device seems unhappy (fails to connect, `num_nodes==1` when ≥2 are plugged in, missing `firmware_version`), open a short firmware log window: `mcp__meshtastic__serial_open(port=

, env=)`, wait 3s, `serial_read(session_id=, max_lines=100)`, `serial_close(session_id=)`. The env should be inferred from the VID map in `mcp-server/run-tests.sh` (nrf52 → rak4631, esp32s3 → heltec-v3) unless `MESHTASTIC_MCP_ENV_*` is set. + +4. **Hub health** (call once, not per-device): `mcp__meshtastic__uhubctl_list()` — enumerates every USB hub the host can see. Note which hubs advertise `ppps=true` and which hub hosts each Meshtastic device (cross-reference by VID). Flag it in the report if: + - No hub advertises PPPS → `tests/recovery/` can't run on this setup; hard-recovery via `uhubctl_cycle` isn't available. + - A Meshtastic device is on a non-PPPS hub → note it; operator may want to move the device to a PPPS hub to unlock auto-recovery. + - `uhubctl_list` raises `ConfigError: uhubctl not found` → just say `uhubctl not installed` in the report; don't treat as a fault. + +5. **Render per-device report** as: + + ```text + [nrf52 @ /dev/cu.usbmodem1101] fw=2.7.23.bce2825, hw=RAK4631 + owner : Meshtastic 40eb / 40eb + region/band : US, channel 88, LONG_FAST + tx_power : 30 dBm, hop_limit=3 + peers : 1 (esp32s3 0x433c2428, pubkey ✓, SNR 6.0 / RSSI -24 dBm) + primary ch : McpTest + hub : 1-1.3 port 2 (PPPS, uhubctl-controllable) + firmware : no panics in last 3s; NodeInfoModule emitted 2 broadcasts + ``` + + Keep it scannable. If a field is missing or abnormal (no pubkey for a known peer, region=UNSET, num_nodes inconsistent with the hub, device on non-PPPS hub), flag it inline with a short `⚠︎ `. + +6. **Cross-device correlation** (only when >1 device is inspected): + - Do both sides see each other in `nodesByNum`? If one does and the other doesn't, that's asymmetric NodeInfo — flag it. + - Do the LoRa configs match? (region, channel_num, modem_preset should all agree; mismatch = no mesh) + - Do the primary channel NAMES match? Mismatch = different PSK = no decode. + +7. 
**Recorder slice (cheap, always available).** The mcp-server runs an autouse log recorder that's been collecting from every connected device. Pull two short slices to surface anything weird that's already happened: + - `mcp__meshtastic__logs_window(start="-2m", level="WARN|ERROR|CRIT", max_lines=20)` — recent firmware errors. If empty, say "no recent errors"; don't manufacture concern. + - `mcp__meshtastic__telemetry_timeline(window="1h", field="free_heap", max_points=60)` — heap trend. If `slope_per_min < -50`, flag it and recommend `/leakhunt window=6h` for a deeper read; otherwise just note the current free heap. + - If `recorder_status` shows `running:false` or `files.telemetry.last_ts` is null, note "recorder has no telemetry yet — enable `set_debug_log_api(True)` to populate" and skip this step gracefully. + +8. **Suggest next actions only for specific, recognisable failure modes**: + - Stale PKI pubkey one-way → "run `/test tests/mesh/test_direct_with_ack.py` — the retry + nodeinfo-ping heals this in the test path." + - Region mismatch → "re-bake one side via `./mcp-server/run-tests.sh --force-bake`." + - Device unreachable, reachable via DFU → `touch_1200bps(port=...)` + `pio_flash`. If not even DFU responds AND the device is on a PPPS hub, escalate to `uhubctl_cycle(role=..., confirm=True)`. + - CP2102-wedged-driver on macOS → see the note in `run-tests.sh`. + - Heap slope strongly negative → "run `/leakhunt window=6h` for a full timeline + classification." + +## What NOT to do + +- No writes. No `set_config`, no `reboot`, no `factory_reset`. This is a read-only diagnostic skill — if the operator wants to change state, they'll ask explicitly. +- No `flash` / `erase_and_flash`. Those are separate escalations. +- No holding SerialInterface across tool calls — open, query, close; next device. The port lock is exclusive. 
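
As an illustration of the `$ARGUMENTS` filter in step 2 of `/diagnose`, here is a minimal Python sketch. It assumes `list_devices` returns dict-like entries with integer `vid` values and the `port` / `likely_meshtastic` keys named in step 1; the helper name `select_devices` and the `ROLE_VIDS` table are hypothetical and are not part of the mcp-server API.

```python
from typing import Iterable

# VID values are the ones quoted in step 2; the keys are the role tokens
# the operator may pass as $ARGUMENTS.
ROLE_VIDS = {
    "nrf52": {0x239A},
    "esp32s3": {0x303A, 0x10C4},
}


def select_devices(devices: Iterable[dict], arg: str = "") -> list:
    """Hypothetical helper: apply the step-2 filter to list_devices() output."""
    # Only entries flagged likely_meshtastic are ever candidates.
    candidates = [d for d in devices if d.get("likely_meshtastic")]
    arg = arg.strip()
    if arg in ("", "all"):
        return candidates
    if arg in ROLE_VIDS:
        return [d for d in candidates if d.get("vid") in ROLE_VIDS[arg]]
    if arg.startswith("/dev/cu."):
        # Assumption: a /dev/cu.* argument must match the port exactly.
        return [d for d in candidates if d.get("port") == arg]
    # Anything else is a substring match against the port string.
    return [d for d in candidates if arg in d.get("port", "")]
```

The branches are checked in the same order as the bullets in step 2, so a role token always wins over the substring fallback.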
diff --git a/.claude/commands/leakhunt.md b/.claude/commands/leakhunt.md new file mode 100644 index 00000000000..ef90b133e24 --- /dev/null +++ b/.claude/commands/leakhunt.md @@ -0,0 +1,103 @@ +--- +description: Hunt for memory leaks (and other slow degradations) by reading the persistent recorder's heap timeline + log slice over a window +argument-hint: [window=1h] [field=free_heap] [variant=local] +--- + + + +# `/leakhunt` — read the recorder, classify a memory leak + +Use the always-on recorder (`mcp-server/.mtlog/`) to read a heap timeline plus the matching log slice and produce a one-page verdict: **steady / slow leak / fragmentation / OOM-imminent**. No firmware changes, no special build flags — the LocalStats telemetry packet that the firmware already broadcasts every ~60 s carries `heap_free_bytes` and `heap_total_bytes`. + +## Two signal paths — pick the right one + +| Path | Build flag | Cadence | Per-thread attribution | Cost | +| --------------------- | ---------------- | -------------- | ---------------------- | ------------------------- | +| LocalStats packet | (default) | ~60 s | No | Free — always on | +| `[heap N]` log prefix | `-DDEBUG_HEAP=1` | every log line | Yes (Thread X leaked) | Bigger flash + log volume | + +Both feed the same `telemetry_timeline(field="free_heap")` query — when DEBUG_HEAP is on, the recorder synthesizes telemetry rows from log prefixes (tagged `source: debug_heap`), so a single timeline call gets whichever signal is available. **For a slow leak diagnosis, the default path is plenty** (60 s cadence over 6 h = 360 points; linear regression over that nails sub-100-byte/min slopes). **DEBUG_HEAP is for attribution** — when the slope is real and you need to know which thread is leaking. + +## What to do + +1. 
**Parse `$ARGUMENTS`**: optional `window` (default `1h`, accepts `30m`/`6h`/`-3d`/etc.), optional `field` (default `free_heap`; alternates: `total_heap`, `battery_level`, anything in the LocalStats variant), optional `variant` (default `local`; alternates: `device`, `environment`, `power`, `airQuality`, `health`). + +2. **Verify the recorder is alive** — call `mcp__meshtastic__recorder_status`. Check: + - `running == True` + - `files.telemetry.lines > 0` (at least one telemetry packet recorded — if zero, the device hasn't broadcast LocalStats yet OR `set_debug_log_api` has never been on; tell the operator to run `mcp__meshtastic__set_debug_log_api(enabled=True)` and wait one device-update interval) + - `files.telemetry.last_ts` within the last 5 minutes (if older, the device is silent — log that, not "leak detected") + +3. **Detect whether DEBUG_HEAP is active** — `mcp__meshtastic__logs_window(start="-2m", grep=r"\\[heap \\d+\\]", max_lines=3)`. If any line matches, the firmware has the prefix → DEBUG_HEAP is on, expect higher-cadence data and `heap_event` rows. If zero matches over the last 2 minutes, you're on the LocalStats-only path. + +4. **Pull the timeline** — `mcp__meshtastic__telemetry_timeline(window=$window, variant=$variant, field=$field, max_points=200)`. Read: + - `samples` — how many raw points contributed + - `min`, `max` — total swing + - `slope_per_min` — units per minute (linear regression over the whole window) + +5. **Pull the log context for the same window** — `mcp__meshtastic__logs_window(start="-${window}", grep="Heap status|leaked heap|freed heap|out of memory|Alloc an err|panic|abort", max_lines=200)`. These are the strings the firmware emits when something memory-related happens (`DEBUG_HEAP` builds emit `"Heap status:"` and `"leaked heap"` lines; production builds emit `"Alloc an err"` on failure and `"out of memory"` on OOM). + +6. 
**Pull marker events** so we know if the operator labeled phases — `mcp__meshtastic__events_window(start="-${window}", kind="mark|connection_lost|connection_established")`. If a `connection_lost` overlaps a sharp drop, that's not a leak; that's a reboot. + +6a. **(DEBUG_HEAP only) Per-thread attribution** — `mcp__meshtastic__logs_window(start="-${window}", grep="leaked heap", max_lines=200)`. Each row has a structured `heap_event` field with `{kind, thread, before, after, delta}`. Aggregate by thread: sum the `delta` over the window per thread name. The thread with the largest cumulative negative delta is your suspect. Note the count too — a thread with 50× small leaks is different from 1× big leak. + +7. **Classify** based on what the data says, NOT on what you wish it said. Use these rules in order: + - **Insufficient data** (< 5 samples): say so. Suggest a longer window or longer wait. Stop. + - **Reboot mid-window**: if any `connection_lost` event is present AND `free_heap` jumped UP at that timestamp, the device rebooted. Note it; pre-reboot trend may be a leak but you only have part of the curve. + - **OOM-imminent**: any `Alloc an err=` or `out of memory` line in the log slice. This trumps everything; flag urgently. + - **Slow leak**: `slope_per_min < -50` AND `max - min > 1000` AND no reboot. The heap is monotonically (or near-monotonically) declining. Estimate time-to-zero: `min / -slope_per_min` minutes. Surface it. + - **Fragmentation suspect**: `slope_per_min` close to zero (|x| < 50) BUT min trends down across the window AND the log slice shows `Alloc an err` warnings WITHOUT total OOM. Means free total is OK but largest contiguous block is shrinking. Recommend a `DEBUG_HEAP` build to confirm. + - **Steady**: |slope_per_min| < 50, no error lines. Heap is fine. + - **Recovery curve**: slope is POSITIVE — heap recovered. Either a workload completed or GC fired. Note it; not a leak. + +8. 
**Report**: + + ```text + /leakhunt window=6h field=free_heap variant=local + ──────────────────────────────────────────────────── + recorder : running, telem last_ts 8s ago + build : DEBUG_HEAP=ON (per-line prefix detected) + samples : 14,200 over 6h (cadence ~1.5s, log-line synth) + free_heap : min 92,344 / max 124,008 / range 31,664 + slope : -82 bytes/min (negative — heap declining) + reboots : none in window + OOM events : none + error lines : 3× "Alloc an err=ESP_ERR_NO_MEM" at +4h12m, +5h08m, +5h44m + thread leaks : (DEBUG_HEAP) MeshPacket -3,124 B over 18 events + Router -1,408 B over 4 events + others -240 B + verdict : SLOW LEAK — primary suspect MeshPacket thread + est. time-to-OOM: ~1,127 min (~18.8 h) at current slope + evidence : (3 log line citations with uptimes) + ``` + + Then: **what to do next.** + - SLOW LEAK, **DEBUG_HEAP off** → recommend rebuilding with the flag and re-running this skill. Concrete one-liner the operator can copy: + ```text + mcp__meshtastic__build(env="", build_flags={"DEBUG_HEAP": 1}) + mcp__meshtastic__pio_flash(env="", port="", confirm=True) + ``` + After flash, set debug_log_api back on and wait one window; re-run `/leakhunt`. + - SLOW LEAK, **DEBUG_HEAP on** → cite the top-leaking thread name from step 6a. Point at the corresponding source file (`grep -rn "ThreadName(\"\")" src/`); the operator decides what to fix. + - FRAGMENTATION SUSPECT → propose pre-allocating any per-packet buffers; or rebuilding with `CONFIG_HEAP_TASK_TRACKING=y` on ESP32 to see who's holding the largest blocks. + - OOM-IMMINENT → flag for immediate attention; don't wait for the next telemetry interval. + - STEADY → say so; stop. Don't invent problems. + +## What NOT to do + +- Don't assume a leak from a single dip. LocalStats fires every ~60 s and the firmware naturally allocates+frees on each broadcast cycle; one packet sees the trough. Look at the slope, not the deltas. +- Don't recommend code changes. 
This skill diagnoses; the operator decides what to fix. +- Don't enable `set_debug_log_api` automatically — if it's off, telemetry isn't reaching pubsub anyway, and the recorder will be empty. Tell the operator to flip it on and wait, then re-run. +- Don't run heavy workloads to "trigger the leak." The recorder is passive; we read what's there. + +## Companion: `mark_event` for stress runs + +If the operator wants to test under stimulus (e.g. blast 50 broadcasts and see what the heap does), they can frame the experiment with markers: + +```text +mark_event("burst-start") +… run the workload … +mark_event("burst-end") +/leakhunt window=15m +``` + +The markers land in both `events.jsonl` and `logs.jsonl`, so the report can show "free_heap dipped 8 KB during the burst window, recovered to baseline within 2 LocalStats cycles" → not a leak. diff --git a/.claude/commands/repro.md b/.claude/commands/repro.md new file mode 100644 index 00000000000..84513e45b18 --- /dev/null +++ b/.claude/commands/repro.md @@ -0,0 +1,70 @@ +--- +description: Re-run a specific test N times in isolation to triage flakes, diff firmware logs between passes and failures +argument-hint: [count=5] +--- + + + +# `/repro` — flakiness triage for one test + +Re-run a single pytest node ID N times in isolation, track pass rate, and surface what's _different_ in the firmware logs between the passing attempts and the failing ones. Turns "it's flaky, I guess" into "it fails when X, passes when Y." + +## What to do + +1. **Parse `$ARGUMENTS`**: first token is the pytest node id (e.g. `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[nrf52->esp32s3]`); second token is an integer count (default `5`, cap at `20`). If the first token doesn't look like a test path (no `::` and no `tests/` prefix), treat the whole `$ARGUMENTS` as a `-k` filter instead. + +2. **Sanity-check the hub first** (so we're not measuring "nothing plugged in" N times): call `mcp__meshtastic__list_devices`. 
If the test name contains `nrf52` or `esp32s3` and the matching VID isn't present, stop and report — re-running won't help. + +3. **Loop N times**. For each iteration: + + ```bash + ./mcp-server/run-tests.sh --tb=short -p no:cacheprovider + ``` + + Capture: exit code, duration, and (on failure) the `Meshtastic debug` firmware log section from `mcp-server/tests/report.html`. `-p no:cacheprovider` suppresses pytest's `.pytest_cache` writes so iterations don't influence each other. + +4. **Track a small structured tally**: + + ```text + attempt 1: PASS (42s) + attempt 2: FAIL (128s) ← firmware log 200-line tail captured + attempt 3: PASS (39s) + attempt 4: FAIL (121s) + attempt 5: PASS (41s) + -------------------------------------- + pass rate: 3/5 (60%) | mean duration: 74s + ``` + +5. **On mixed outcomes**: diff the firmware log tails between a representative passing attempt and a representative failing attempt. Focus on: + - Error-level lines only present in failures (`PKI_UNKNOWN_PUBKEY`, `Alloc an err=`, `Skip send`, `No suitable channel`) + - Timing around the assertion event — did a broadcast go out, was there an ACK, did NAK fire? + - Device state fields that changed (nodesByNum entries, region/preset, channel_num) + + Surface the top 3 differences as a "passes when / fails when" table. Don't dump full logs — pull specific lines with uptime timestamps. + +5a. **Archive recorder slices per attempt** (no extra device interaction; the recorder runs autouse). Right after each attempt finishes, capture its `(start_ts, end_ts)` and call `mcp__meshtastic__recorder_export(start=, end=, dest_dir="mcp-server/tests/repro_artifacts//attempt_/")`. This drops a `logs.jsonl`, `telemetry.jsonl`, `packets.jsonl`, and `events.jsonl` snapshot scoped to the attempt window. Use these for cross-attempt diffs in step 5: `jq '.line' logs.jsonl` is faster than re-running the test, and the telemetry slice lets you compare heap behavior across attempts. + +6. 
**Classify the flake** into one of: + - **LoRa airtime collision** → pass rate improves with fewer concurrent transmitters; propose a `time.sleep` gap or retry bump in the test body. + - **PKI key staleness** → fails on first attempt, passes after self-heal; existing retry loop in `test_direct_with_ack.py` handles this. + - **NodeInfo cooldown** → `Skip send NodeInfo since we sent it <600s ago` in fail-only logs; needs `broadcast_nodeinfo_ping()` warmup. + - **Hardware-specific** (one direction fails, other passes; one device's firmware is older; driver wedged) → specific recovery pointer. For a device that's wedged past `touch_1200bps`, the next escalation is `uhubctl_cycle(role=..., confirm=True)` to hard-power-cycle its hub port (requires `uhubctl` installed). + - **Device went dark mid-run** → fails from some attempt onward, never recovers, firmware log stops arriving. Almost always hardware: a Guru crash + frozen CDC. Hard-power-cycle via `uhubctl_cycle(role=..., confirm=True)` before the next iteration; if that also fails, escalate to replug. + - **Genuinely unknown** → say so; don't invent a root cause. + +7. **Report back** with: + - Pass rate and mean duration. + - Classification + evidence (the specific log lines that support it). + - A suggested next step (re-run with specific args, open `/diagnose`, edit a specific test file, nothing). + +## Examples + +- `/repro tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[esp32s3->nrf52] 10` — runs 10 times, diffs firmware logs. +- `/repro broadcast_delivers` — no `::`, no `tests/`, so interpreted as `-k broadcast_delivers`; runs every matching test the default 5 times. +- `/repro tests/telemetry/test_device_telemetry_broadcast.py 3` — shorter run for a slow test. + +## Constraints + +- Don't exceed `count=20` per invocation — airtime and USB wear add up. If the user asks for 50, negotiate down. 
+- Don't rebuild firmware as part of triage; flakes that only reproduce under different firmware belong in a separate session. +- If the FIRST attempt fails AND the rest all pass, that's a classic "state leak from a prior test" → say so and suggest running with `--force-bake` or starting from a clean state rather than chasing the first failure. diff --git a/.claude/commands/test.md b/.claude/commands/test.md new file mode 100644 index 00000000000..46a753749a3 --- /dev/null +++ b/.claude/commands/test.md @@ -0,0 +1,47 @@ +--- +description: Run the mcp-server test suite (auto-detects devices) and interpret the results +argument-hint: [pytest-args] +--- + +# `/test` — mcp-server test runner with interpretation + +Run `mcp-server/run-tests.sh` and make sense of the output so the operator doesn't have to. + +## What to do + +1. **Invoke the wrapper.** From the firmware repo root, run: + + ```bash + ./mcp-server/run-tests.sh $ARGUMENTS + ``` + + The wrapper auto-detects connected Meshtastic devices, maps each to its PlatformIO env, exports the required `MESHTASTIC_MCP_ENV_*` env vars, and invokes pytest. If the user passed no arguments, the wrapper supplies a sensible default set (`tests/ --html=tests/report.html --self-contained-html --junitxml=tests/junit.xml -v --tb=short`). A `--report-log=tests/reportlog.jsonl` arg is always appended (unless the operator passed their own). `--assume-baked` is deliberately NOT in the defaults — `test_00_bake.py` has its own skip-if-already-baked check and runs the ~8 s verification by default. Operators can opt into the fast path with `--assume-baked`, or force a reflash with `--force-bake`. + +2. **Read the pre-flight header.** First ~6 lines print the detected hub (role → port → env). If that line reads `detected hub : (none)`, the wrapper will narrow to `tests/unit` only — say so explicitly in your summary so the operator knows hardware tiers were skipped. + +3. **On pass**: one-line summary of the form `N passed, M skipped in `. 
Don't enumerate the test names — the user can read those. Do mention any SKIPPED tests and name the cause: + - `"role not present on hub"` → device unplugged; operator knows to reconnect. + - `"firmware not baked with USERPREFS_UI_TEST_LOG"` → tests/ui skipped because the macro isn't in firmware yet; suggest `--force-bake`. + - `"uhubctl not installed"` → tests/recovery + peer-offline skipped; suggest `brew install uhubctl` / `apt install uhubctl`. + - `"no PPPS-capable hubs detected"` → tests/recovery skipped because the hub doesn't support per-port power; the tier will never run on that setup. + - `"opencv-python-headless is not installed"` → tests/ui auto-deselected by run-tests.sh; suggest `pip install -e 'mcp-server/.[ui]'`. + +4. **On failure**: for every FAILED test, open `mcp-server/tests/report.html` and extract the `Meshtastic debug` section for that test. pytest-html embeds the firmware log stream + device state dump there; the 200-line firmware log tail is usually enough to explain the failure. Summarise: which test, one-line assertion message, the firmware log lines that matter (things like `PKI_UNKNOWN_PUBKEY`, `Skip send NodeInfo`, `Error=`, `Guru Meditation`, `assertion failed`). For UI-tier failures also glance at `mcp-server/tests/ui_captures///transcript.md` — it records each step's frame + OCR. + +5. **Classify the failure** as one of: + - **Transient/flake**: LoRa collision, timing-sensitive assertion, first-attempt NAK + successful retry pattern. Propose `/repro ` to confirm. + - **Environmental**: device unreachable, port busy, CP2102 driver wedged. Suggest the specific recovery in escalation order: (a) replug USB, (b) `touch_1200bps(port=...)` + `pio_flash` for nRF52 DFU, (c) `uhubctl_cycle(role="nrf52", confirm=True)` when a device is fully wedged past DFU (needs `uhubctl` installed — `baked_single`'s auto-recovery hook does this once automatically). Also check `git status userPrefs.jsonc`. 
+ - **Regression**: same assertion fails repeatedly, firmware log shows a new/unusual error. Surface the diff between expected and observed, identify the module likely responsible. + +6. **Never run destructive recovery automatically.** If a failure looks like it needs a reflash, `factory_reset`, `uhubctl_cycle`, or USB replug, _describe what to do_ — don't execute. The operator decides. + +## Arguments handling + +- No args → wrapper's defaults (full suite). +- `$ARGUMENTS` passed verbatim to the wrapper, which passes them to pytest. +- Common operator invocations: `/test tests/mesh`, `/test tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip`, `/test --force-bake`, `/test -k telemetry`. + +## Side-effects to mention in summary + +- The session fixture snapshots `userPrefs.jsonc` at session start and restores at teardown (plus on `atexit`). After a clean run, `git status userPrefs.jsonc` should be empty. If the wrapper's pre-flight printed a warning about a stale sidecar, call that out — means a prior session crashed. +- `mcp-server/tests/report.html` and `junit.xml` are regenerated on every run; the HTML is self-contained (shareable). 
diff --git a/.devcontainer/devcontainer.json b/.devcontainer/devcontainer.json index e3f076ce061..a631832cbd1 100644 --- a/.devcontainer/devcontainer.json +++ b/.devcontainer/devcontainer.json @@ -8,17 +8,21 @@ "features": { "ghcr.io/devcontainers/features/python:1": { "installTools": true, - "version": "3.14" + "version": "3.13" } }, "customizations": { "vscode": { "extensions": [ "ms-vscode.cpptools", - "platformio.platformio-ide", + "Jason2866.esp-decoder", + "pioarduino.pioarduino-ide", "Trunk.io" ], - "unwantedRecommendations": ["ms-azuretools.vscode-docker"], + "unwantedRecommendations": [ + "ms-azuretools.vscode-docker", + "platformio.platformio-ide" + ], "settings": { "extensions.ignoreRecommendations": true } diff --git a/.github/ISSUE_TEMPLATE/Bug Report.yml b/.github/ISSUE_TEMPLATE/Bug Report.yml index bc77e8c1b8c..cdf4823445e 100644 --- a/.github/ISSUE_TEMPLATE/Bug Report.yml +++ b/.github/ISSUE_TEMPLATE/Bug Report.yml @@ -75,11 +75,11 @@ body: - type: checkboxes id: mui attributes: - label: Is this bug report about any UI component firmware like InkHUD or Meshtatic UI (MUI)? + label: Is this bug report about any UI (https://meshtastic.org/docs/configuration/device-uis/) component firmware? options: - - label: Meshtastic UI aka MUI colorTFT - - label: InkHUD ePaper - - label: OLED slide UI on any display + - label: Meshtastic UI aka MUI + - label: InkHUD + - label: BaseUI - type: input id: version diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 24e11bd4ddb..d165f2cdb61 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -13,6 +13,7 @@ Meshtastic is an open-source LoRa mesh networking project for long-range, low-po - **RP2040/RP2350** - Raspberry Pi Pico variants - **STM32WL** - STM32 with integrated LoRa - **Linux/Portduino** - Native Linux builds (Raspberry Pi, etc.) 
+- **macOS native** - Headless `meshtasticd` on Apple Silicon / x86_64; see `variants/native/portduino/platformio.ini` for Homebrew prereqs + CH341 LoRa setup ### Supported Radio Chips @@ -70,6 +71,163 @@ PKI (Public Key Infrastructure) messages have special handling: - Accepted on a special "PKI" channel - Allow encrypted DMs between nodes that discovered each other on downlink-enabled channels +## Encryption & Key Management + +Meshtastic packets on the air are typically encrypted one of two ways: the **per-channel symmetric** layer (AES-CTR with a shared PSK) for broadcasts and channel traffic, and the **per-peer PKI** layer (X25519 ECDH → AES-256-CCM) for direct messages and remote admin. A channel with a 0-byte PSK (or Ham mode, which wipes PSKs) transmits cleartext — see the size table below. Both are implemented in `src/mesh/CryptoEngine.cpp`; the send/receive dispatch lives in `src/mesh/Router.cpp`; admin authorization lives in `src/modules/AdminModule.cpp`. + +### High-level model + +- **Channels** are symmetric rooms: anyone with the PSK can read any message on the channel. Channel 0 is the "primary" channel and ships with the short-form default PSK on factory devices, forming the public mesh most users join. (The LoRa modem preset `LONG_FAST` lives on `config.lora.modem_preset` and is an independent field — don't conflate "channel 0 default PSK" with the modem preset name.) +- **DMs** addressed to a single node require PKI so that other holders of the channel PSK can't read them. Outside Ham mode, Meshtastic does not fall back to channel-symmetric encryption when the destination public key is unknown. +- **Remote admin** is a DM carrying an `AdminMessage`. The receiver only acts on it if the sender's public key is on its allowlist (`config.security.admin_key[0..2]`). +- **Ham mode** (`owner.is_licensed=true`, where `owner` is the local `meshtastic_User` record) disables PKI entirely and sends cleartext — FCC Part 97 prohibits encryption on amateur bands. 
+- **No ratchet, no session.** Every packet is encrypted from scratch — a stateless design that matches the high-loss, store-and-forward nature of LoRa. + +### Symmetric channel encryption (AES-CTR) + +`CryptoEngine::encryptPacket` / `decrypt` / `encryptAESCtr` in `src/mesh/CryptoEngine.cpp`. + +- **Cipher**: AES-CTR, AES-128 or AES-256 depending on key length. Same routine in both directions (CTR is a stream cipher, so encrypt == decrypt). +- **Key**: `ChannelSettings.psk` bytes. Size semantics: + - **0 bytes** → no encryption, cleartext on the air + - **1 byte** → short-form index into the well-known `defaultpsk[]` in `src/mesh/Channels.h`. Index 0 = cleartext; 1 = defaultpsk unchanged; 2..255 = defaultpsk with its last byte incremented by (index − 1). This is what the CLI's `--ch-set psk default` produces. + - **16 bytes** → raw AES-128 key + - **32 bytes** → raw AES-256 key + - **2..15 bytes** → zero-padded to 16 and used as AES-128 (with a warn log); **17..31 bytes** → zero-padded to 32 and used as AES-256 (with a warn log). Defensive fallback for malformed PSK input, not something to rely on. +- **Nonce (128 bit)**: `packet_id` (u64 LE) ‖ `from_node` (u32 LE) ‖ `block_counter` (u32, starts at 0). Built in `CryptoEngine::initNonce`. +- **No AEAD**: channel packets carry no MAC, so the channel-hash byte is not an integrity or authenticity check. `Channels::getHash` is a 1-byte XOR-derived hint over the channel name bytes and PSK bytes that helps receivers pick a candidate channel/PSK for decryption. Because it is only a small hint and collisions are easy to find, it should be described purely as a PSK-selection aid, not as a security filter an attacker cannot bypass. +- **Channel 0 is special in one way only**: it's the channel the Router attempts PKI decryption on before falling through to AES-CTR. Non-zero channels always go straight to AES-CTR. 
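
The PSK size table and the nonce layout above can be sketched in a few lines of Python. This is an illustration of the rules as described here, not a port of `CryptoEngine` or `Channels`: the function names are hypothetical, the default key is passed in rather than hardcoded, and two details are assumptions not stated above (the last-byte increment wraps modulo 256, and the block counter is packed little-endian, which makes no difference at its starting value of 0).

```python
import struct
from typing import Optional


def expand_psk(psk: bytes, default_psk: bytes) -> Optional[bytes]:
    """Return the effective AES key, or None when the channel is cleartext."""
    n = len(psk)
    if n == 0:
        return None  # 0 bytes: no encryption
    if n == 1:
        index = psk[0]
        if index == 0:
            return None  # short-form index 0: cleartext
        key = bytearray(default_psk)
        # Index 1 leaves the default untouched; 2..255 bump the last byte.
        key[-1] = (key[-1] + index - 1) & 0xFF  # wrap-around is an assumption
        return bytes(key)
    if n in (16, 32):
        return psk  # raw AES-128 / AES-256 key
    if n < 16:
        return psk.ljust(16, b"\x00")  # defensive zero-pad to AES-128
    if n < 32:
        return psk.ljust(32, b"\x00")  # defensive zero-pad to AES-256
    raise ValueError("PSK longer than 32 bytes is not covered by the table")


def ctr_nonce(packet_id: int, from_node: int, block_counter: int = 0) -> bytes:
    """packet_id (u64 LE) || from_node (u32 LE) || block_counter (u32)."""
    return struct.pack("<QII", packet_id, from_node, block_counter)
```

Because the nonce is derived entirely from packet fields, a receiver can rebuild it without any per-session state, which is what the "no ratchet, no session" bullet relies on.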
+ +### PKI encryption for DMs (X25519 ECDH + AES-256-CCM) + +`CryptoEngine::encryptCurve25519` / `decryptCurve25519` in `src/mesh/CryptoEngine.cpp`. + +- **Keypair**: Curve25519 (aka X25519), 32-byte public + 32-byte private. Stored in `config.security.public_key` / `private_key`; the public half is mirrored into `owner.public_key` so it rides along in NodeInfo broadcasts and propagates through the mesh like any other identity field. +- **Key generation** (`generateKeyPair`): stirs `HardwareRNG::fill()` (64 B from platform TRNG when available), the 16-byte `myNodeInfo.device_id`, and a call to `random()` into the rweather/Crypto library's software RNG, then `Curve25519::dh1`. `regeneratePublicKey` recomputes the public half from a known private (used when restoring from backup). +- **Keygen entry points**: at boot, `NodeDB` calls `generateKeyPair` (or `regeneratePublicKey` when a stored private key is present and passes a low-entropy check) **directly** when `!owner.is_licensed` and `config.lora.region != UNSET`. `ensurePkiKeys` wraps the same logic for runtime/admin flows — it's the path `AdminModule::handleSetConfig` runs when first assigning a valid region or when security config is written; **do not assume it's the universal boot-time gate**, because the NodeDB path bypasses it. +- **Handshake**: `Curve25519::dh2(local_private, remote_public) → 32-byte shared secret → SHA-256 → 32-byte AES-256 key`. Recomputed per packet. The SHA-256 step is effectively a KDF over the raw ECDH output. +- **Cipher**: AES-256-CCM via `aes_ccm_ae` / `aes_ccm_ad` (`src/mesh/aes-ccm.cpp`). MAC length (the `M` parameter) is **8 bytes**. No AAD — the MAC covers ciphertext only. +- **Nonce (13 bytes / 104 bit)**: `aes_ccm_ae`/`aes_ccm_ad` use a 13-byte CCM nonce (`L = 2` is hardcoded in `src/mesh/aes-ccm.cpp`), not a 16-byte nonce. 
For PKI packets, `CryptoEngine::initNonce(fromNode, packetNum, extraNonce)` starts from the usual packet-derived nonce material, then overwrites nonce bytes `4..7` with a fresh 32-bit `extraNonce = random()`. The effective nonce bytes are therefore: bytes `0..3` = `packet_id`, bytes `4..7` = transmitted `extraNonce`, bytes `8..11` = `from_node`, byte `12` = `0x00`. The receiver reconstructs the same 13-byte nonce from the packet metadata plus the appended `extraNonce`. +- **Wire overhead**: 12 bytes appended to the ciphertext = 8-byte MAC ‖ 4-byte extraNonce. Defined as `MESHTASTIC_PKC_OVERHEAD = 12` in `src/mesh/RadioInterface.h`. Only the 4-byte `extraNonce` is sent; the rest of the 13-byte CCM nonce is reconstructed from packet fields as described above. The Router's send path checks this overhead against `MAX_LORA_PAYLOAD_LEN` before committing to PKI. +- **Send selection** (`Router::send`): the sender enters the PKI path when **all** hold — we're the originator AND not Ham mode AND not Portduino simradio AND not on the `serial`/`gpio` channels (unless the packet is already marked `pki_encrypted`) AND `config.security.private_key.size == 32` AND destination is a single node (not broadcast) AND the portnum isn't infrastructure. `TRACEROUTE_APP`, `NODEINFO_APP`, `ROUTING_APP`, and `POSITION_APP` are routed through channel encryption even when DMed (these need to be readable by relaying peers). Once on the PKI path, if the destination's public key isn't in our NodeDB the send **fails** with `PKI_SEND_FAIL_PUBLIC_KEY` — it does not silently fall back to channel encryption. If the client explicitly set `pki_encrypted=true` and any condition blocks PKI, the send fails with `PKI_FAILED`. +- **Receive selection** (`Router::perhapsDecode`): try PKI decrypt first when `channel == 0` AND `isToUs(p)` AND not broadcast AND both peers have public keys in NodeDB AND `rawSize > MESHTASTIC_PKC_OVERHEAD`. 
On success the packet gets `pki_encrypted=true` stamped and the sender's public key copied into `p->public_key` for downstream authorization. + +### Remote admin authorization + +Implemented in `src/modules/AdminModule.cpp` → `handleReceivedProtobuf`. The authorization check runs in this order: + +1. **Response messages** — if `messageIsResponse(r)` is true (the payload is a response to one of our earlier admin requests), it's accepted without any further check. The in-file comment flags this as a known-untightened gap: a stricter implementation would remember which `public_key` we last queried and reject responses that don't match. +2. **Local admin** — `mp.from == 0` (phone app over BLE, serial CLI, internal module); never travels over the air. **Rejected** if `config.security.is_managed` is true, because managed devices expect admin to arrive over the air through an authorized remote path. +3. **Legacy admin channel (deprecated)** — the packet arrived on a channel named literally `"admin"`. Gated by `config.security.admin_channel_enabled`; returns `NOT_AUTHORIZED` if the flag is false. Kept for backward compatibility; new deployments should use PKI admin. +4. **PKI admin (preferred for remote)** — `mp.pki_encrypted == true` AND `mp.public_key` matches one of `config.security.admin_key[0..2]` (up to three authorized 32-byte Curve25519 public keys, typically copied from the admin node's own `user.public_key`). +5. **Fallthrough** → `NOT_AUTHORIZED`. + +On top of authorization, any remote admin message that **mutates** state (not a request, not a response) also has to pass a session-key check (`checkPassKey`): the client must first pull a fresh 8-byte `session_passkey` via `get_admin_session_key_request`, then echo that passkey back in the mutating message. The device rotates the passkey after 150 s and rejects values older than 300 s — a narrow anti-replay window on top of the PKI layer. 
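
The authorization order above can be sketched as a pure function. This is a simplified model under stated assumptions, not the body of `handleReceivedProtobuf`: the `AdminRequest` and `SecurityConfig` structs, the `AdminDecision` enum, and `authorizeAdmin` are hypothetical names, the rejection of local admin on managed devices is folded into `NotAuthorized`, and the session-passkey check for mutating messages is left out.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using PublicKey = std::array<uint8_t, 32>; // Curve25519 public key

enum class AdminDecision { Accept, NotAuthorized };

// Hypothetical, flattened view of the fields the check order reads.
struct AdminRequest {
    bool isResponse;         // messageIsResponse(r)
    uint32_t from;           // mp.from; 0 means local (BLE, serial, internal)
    std::string channelName; // name of the channel the packet arrived on
    bool pkiEncrypted;       // mp.pki_encrypted
    PublicKey senderKey;     // mp.public_key
};

struct SecurityConfig {
    bool isManaged;                   // config.security.is_managed
    bool adminChannelEnabled;         // config.security.admin_channel_enabled
    std::vector<PublicKey> adminKeys; // populated admin_key slots, at most three
};

AdminDecision authorizeAdmin(const AdminRequest &req, const SecurityConfig &cfg)
{
    assert(cfg.adminKeys.size() <= 3);

    // 1. Responses to our own earlier requests are accepted as-is.
    if (req.isResponse)
        return AdminDecision::Accept;

    // 2. Local admin is allowed unless the device is managed.
    if (req.from == 0)
        return cfg.isManaged ? AdminDecision::NotAuthorized : AdminDecision::Accept;

    // 3. Legacy "admin" channel, gated by its enable flag.
    if (req.channelName == "admin")
        return cfg.adminChannelEnabled ? AdminDecision::Accept : AdminDecision::NotAuthorized;

    // 4. PKI admin: packet was PKI-decrypted and the sender key is on the allowlist.
    const bool keyAllowed =
        std::find(cfg.adminKeys.begin(), cfg.adminKeys.end(), req.senderKey) != cfg.adminKeys.end();
    if (req.pkiEncrypted && keyAllowed)
        return AdminDecision::Accept;

    // 5. Fallthrough.
    return AdminDecision::NotAuthorized;
}
```

Keeping the checks in one ordered function makes the precedence explicit: in this model a packet on the legacy channel never reaches the PKI check, which matches the numbered order above.
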
+ +`config.security.is_managed = true` disables **local** admin writes (`mp.from == 0` is rejected). It does not by itself force every admin action through PKI — the legacy `"admin"` channel still authorizes remote admin when `config.security.admin_channel_enabled == true`. The AdminModule refuses to persist `is_managed=true` unless at least one `admin_key` is populated — a deliberate guard against operators locking themselves out. + +### Key-rotation hazards (actions that invalidate peers) + +- **`factory_reset_device`** (the "full" variant, calls `NodeDB::factoryReset(eraseBleBonds=true)`) → **wipes** the X25519 private key; a fresh keypair is generated on the next region-set. Every existing peer holds the old public key, so DMs to this node silently fail PKI decrypt until every peer re-exchanges NodeInfo. +- **`factory_reset_config`** (the "partial" variant, calls `NodeDB::factoryReset()` with `eraseBleBonds=false`) → **preserves** the X25519 private key in `installDefaultConfig(preserveKey=true)`; the public key is zeroed and gets rebuilt from the preserved private key on the next boot via the NodeDB path's `regeneratePublicKey` call. Identity is preserved and the mesh does not need to re-exchange keys. +- **`region=UNSET → valid region`** → `ensurePkiKeys` runs inside the same `handleSetConfig` path; missing keys get generated at that moment. +- **Ham mode transitions** — entering Ham mode (`owner.is_licensed=true`) runs `Channels::ensureLicensedOperation`, which **wipes every channel PSK** (all traffic becomes cleartext) and disables the legacy admin channel. The X25519 private key is preserved on the device but not used because `Router::send` skips PKI when `owner.is_licensed` is true. Leaving Ham mode re-enables PKI with the preserved keypair but does not restore the wiped channel PSKs — the operator has to re-set them. 
+- **Channel 0 PSK change** → every peer must re-learn the channel hash; cached NodeInfo becomes temporarily unreachable until the next broadcast. +- **`security.private_key` blanked via admin** → regenerates both halves (unless in Ham mode) and propagates the new public key via NodeInfo. + +## NodeDB Layout (v25) + +`DEVICESTATE_CUR_VER = 25`, `DEVICESTATE_MIN_VER = 24`. The on-device NodeDB was split in v25 into a slim header table plus four optional satellite stores. Older v24 saves auto-migrate at boot. Old training-data instincts (`node->user.long_name`, `node->position.latitude_i`, `node->is_favorite`, `node->device_metrics.battery_level`) are wrong now — the fields aren't there. Read this section before touching anything that walks `nodeDB->meshNodes`. + +### Slim `NodeInfoLite` + +`UserLite` is flattened onto `NodeInfoLite` (no nested sub-message); `position` and `device_metrics` are removed entirely (tags reserved). MAC address is dropped. Long names are capped at 25 chars (`max_size:25` in `deviceonly.options`); `hw_model` and `role` are `int_size:8`. Encoded size dropped from ~166 B → ~105 B per node. + +Booleans are bit-packed into `NodeInfoLite.bitfield`. 
**Do not read or write the bits directly** — use the inline helpers in `src/mesh/NodeDB.h`: + +```cpp +nodeInfoLiteHasUser(n) // bit 5 — user fields populated +nodeInfoLiteIsFavorite(n) // bit 3 +nodeInfoLiteIsIgnored(n) // bit 4 +nodeInfoLiteIsMuted(n) // bit 1 +nodeInfoLiteIsLicensed(n) // bit 6 — Ham mode peer +nodeInfoLiteIsKeyManuallyVerified(n) // bit 0 +nodeInfoLiteHasIsUnmessagable(n) // bit 8 — "is_unmessagable was sent" +nodeInfoLiteIsUnmessagable(n) // bit 7 +// via_mqtt is bit 2 (mask exposed; predicate uses the mask directly) + +nodeInfoLiteSetBit(n, NODEINFO_BITFIELD_IS_FAVORITE_MASK, true); // setter +``` + +### Satellite stores + +Four `std::unordered_map` members on `NodeDB`, each gated by its own build flag: + +| Map | Value type | Build flag | +| ----------------- | ------------------------------- | ---------------------------------- | +| `nodePositions` | `meshtastic_PositionLite` | `MESHTASTIC_EXCLUDE_POSITIONDB` | +| `nodeTelemetry` | `meshtastic_DeviceMetrics` | `MESHTASTIC_EXCLUDE_TELEMETRYDB` | +| `nodeEnvironment` | `meshtastic_EnvironmentMetrics` | `MESHTASTIC_EXCLUDE_ENVIRONMENTDB` | +| `nodeStatus` | `meshtastic_StatusMessage` | `MESHTASTIC_EXCLUDE_STATUSDB` | + +Defaults are ON (i.e., maps **excluded**) for STM32WL only — see `src/mesh/mesh-pb-constants.h`. On every other arch all four maps are present. When excluded, the map member is absent and the corresponding accessors return `false`. + +All four maps are guarded by **`mutable concurrency::Lock satelliteMutex`** — concurrent access from receive threads, the phone API state machine, and the renderer is the rule, not the exception. 
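
One way to picture a mutex-guarded satellite map with copy-out reads is the sketch below. It is a simplified stand-in, not the `NodeDB` implementation: `SatelliteStore`, the trimmed `PositionLite` struct, and `satelliteStoreUsage` are hypothetical, only one of the four maps is modeled, and `std::mutex` substitutes for `concurrency::Lock`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>

using NodeNum = uint32_t;

// Trimmed, hypothetical stand-in for meshtastic_PositionLite.
struct PositionLite {
    int32_t latitude_i;
    int32_t longitude_i;
    uint32_t time;
};

class SatelliteStore
{
  public:
    // Copy-out read: takes the lock, copies the value if present,
    // and never exposes a pointer into the map.
    bool copyNodePosition(NodeNum num, PositionLite &out) const
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        auto it = nodePositions.find(num);
        if (it == nodePositions.end())
            return false;
        out = it->second;
        return true;
    }

    // Writer: owns the lock for the duration of the update.
    void updatePosition(NodeNum num, const PositionLite &pos)
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        nodePositions[num] = pos;
    }

    // Eviction chokepoint: drop every satellite entry for a node.
    void eraseNodeSatellites(NodeNum num)
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        nodePositions.erase(num);
    }

  private:
    mutable std::mutex satelliteMutex; // mutable so const readers can lock
    std::unordered_map<NodeNum, PositionLite> nodePositions;
};

// Usage sketch: absent entries report false instead of handing back a pointer.
inline void satelliteStoreUsage()
{
    SatelliteStore store;
    PositionLite out{};
    assert(!store.copyNodePosition(0x1234, out));
    store.updatePosition(0x1234, PositionLite{10, 20, 30});
    assert(store.copyNodePosition(0x1234, out) && out.time == 30);
}
```

Because readers receive a copy, the value stays valid after the lock is released, even if another thread evicts the node in the meantime.
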
+ +### Accessor convention + +**Never hand out pointers into the maps.** Use the copy-out accessors on `NodeDB`: + +```cpp +bool copyNodePosition(NodeNum, meshtastic_PositionLite &out) const; +bool copyNodeTelemetry(NodeNum, meshtastic_DeviceMetrics &out) const; +bool copyNodeEnvironment(NodeNum, meshtastic_EnvironmentMetrics &out) const; +bool copyNodeStatus(NodeNum, meshtastic_StatusMessage &out) const; +``` + +Each takes the lock, copies the value if present, returns `false` if the entry is absent or the DB is excluded. Pass-by-out-param is deliberate — pointer-style accessors would invite UAF and lock-leak bugs across the renderer. The "has any X" convenience predicates (`hasValidPosition` etc.) are implemented in terms of these. + +Writers go through `setNodeStatus`, `updatePosition`, `updateTelemetry` (which dispatches on `which_variant` for device vs environment metrics) — these own the lock and the eviction hooks. + +### Eviction + +Every code path that drops a node from the header table must also evict the satellites. The single chokepoint is `eraseNodeSatellites(NodeNum)`; it's already called from `getOrCreateMeshNode`'s oldest-boring eviction, `removeNodeByNum`, both branches of `resetNodes`, `cleanupMeshDB`, `addFromContact`'s ignored-branch, and `AdminModule`'s `set_ignored_node`. Add new eviction sites here, not by calling `.erase()` directly. + +### Sync flow: thin NodeInfo + post-COMPLETE_ID replay (no opt-in) + +There is no capability flag and no special "gradient" nonce. The **default** sync flow is: + +1. Config / module-config / channel / metadata segments (same as before). +2. `STATE_SEND_OWN_NODEINFO` — **our own** NodeInfo, still bundled with our position and device_metrics (because the replay snapshot excludes our own NodeNum). Emitted via `ConvertToNodeInfo(lite)`. +3. `STATE_SEND_OTHER_NODEINFOS` — every other peer's NodeInfo, **always thin** (no `position`, no `device_metrics`). Emitted via `ConvertToNodeInfoThin(lite)`. +4. 
`STATE_SEND_FILEMANIFEST` → `STATE_SEND_COMPLETE_ID` — the phone sees `config_complete_id` and treats sync as done. +5. `STATE_SEND_PACKETS` — live mesh packets, with a trailing replay drain interleaved. The replay drain walks four cached satellite stores in order (positions → telemetry → environment → status) and emits each cached entry as an ordinary `MeshPacket` on the matching portnum (`POSITION_APP`, `TELEMETRY_APP` device + environment variants, `NODE_STATUS_APP`). These are indistinguishable on the wire from live mesh traffic, so clients need no special handling — any code that already updates UI on `POSITION_APP` etc. works. + +`PhoneAPI::sendConfigComplete()` arms `replayPhase = REPLAY_PHASE_POSITIONS` for default/full sync and `SPECIAL_NONCE_ONLY_NODES`, while `SPECIAL_NONCE_ONLY_CONFIG` skips replay. The drain runs inside `STATE_SEND_PACKETS` via `popReplayPacket()`, lower priority than live traffic. When all four phases drain, `replayPhase` flips back to `REPLAY_PHASE_IDLE` and the snapshot vectors get `shrink_to_fit`ed. + +STM32WL and any other build with all four `MESHTASTIC_EXCLUDE_*DB` flags set produces zero replay packets — `popReplayPacket` advances through each phase in microseconds without emitting anything. + +Special nonces that still mean something: + +- `SPECIAL_NONCE_ONLY_CONFIG` (69420) — skip node sync entirely, just config. +- `SPECIAL_NONCE_ONLY_NODES` (69421) — skip config segments, jump straight to `STATE_SEND_OWN_NODEINFO`. Still gets the post-COMPLETE_ID replay drain. + +There are no other reserved nonces; everything else is a fresh random `want_config_id` from the client. + +### v24 → v25 migration + +The legacy migration code lives in **`src/mesh/NodeDBLegacyMigration.cpp`**, not in `NodeDB.cpp`. It owns the `meshtastic_NodeDatabase_Legacy` callback and `NodeDB::migrateLegacyNodeDatabase()`. The legacy proto descriptor is `protobufs/meshtastic/deviceonly_legacy.proto` (only included by the migration TU). 
The boot path peeks the file's leading version tag, runs the migration if `version < 25`, then re-saves in v25 layout. The legacy descriptor is scheduled for removal once `DEVICESTATE_MIN_VER` is bumped. + +### Read-site rules of thumb + +- Never `node->position.X` / `node->device_metrics.X` — those fields no longer exist. Pull from the satellite map via `copyNodePosition` / `copyNodeTelemetry`. +- Never `node->user.long_name` — `long_name`, `short_name`, `public_key`, `hw_model`, `role`, `macaddr` (gone), `is_licensed`, `is_unmessagable` are flat on `NodeInfoLite`. +- Never `node->is_favorite` / `node->is_ignored` / `node->via_mqtt` / `node->is_key_manually_verified` — use the bitfield helpers. +- Never assume `nodeDB->getMeshNode(num)->position.time` — call `copyNodePosition` and check the return. +- Don't lock `satelliteMutex` yourself in renderer code; the copy-out accessors already do. + +Unit tests for the conversion layer live in `test/test_type_conversions/test_main.cpp` (Unity) — bitfield round-trips, `long_name` truncation, thin-vs-full conversions. Add cases there when extending the schema. + ## Project Structure ``` @@ -80,7 +238,7 @@ firmware/ │ │ ├── NodeDB.* # Node database management │ │ ├── Router.* # Packet routing │ │ ├── Channels.* # Channel management -│ │ ├── CryptoEngine.* # AES-CCM encryption +│ │ ├── CryptoEngine.* # AES-CTR (channels) + X25519 ECDH→AES-256-CCM (PKI for DMs/admin) │ │ ├── *Interface.* # Radio interface implementations │ │ ├── api/ # WiFi/Ethernet server APIs (ServerAPI, PacketAPI) │ │ ├── http/ # HTTP server (WebServer, ContentHandler) @@ -131,6 +289,8 @@ firmware/ - Prefer `LOG_DEBUG`, `LOG_INFO`, `LOG_WARN`, `LOG_ERROR` for logging - Use `assert()` for invariants that should never fail - C++17 features are available (`std::optional`, structured bindings, `if constexpr`, etc.) 
+- **Keep code comments minimal — one or two lines, max.** Comment only when the _why_ isn't obvious from the code; never restate what the next line does. No multi-paragraph block comments explaining straightforward changes. The diff and commit message carry the rationale; the code carries the behavior. +- **Use `Throttle` for time-based rate limiting, not raw `millis()` math.** `src/mesh/Throttle.h` provides `Throttle::isWithinTimespanMs(lastMs, intervalMs)` (returns true while inside the cooldown) and `Throttle::execute(&lastMs, intervalMs, func)` (function-pointer form that updates the timestamp on fire). Use these for any "did N ms pass since X" check — raw `millis() > lastMs + N` is rollover-unsafe (breaks after ~49.7 days) and inconsistent with the rest of the codebase. The helpers compute `now - lastMs` with unsigned subtraction, which wraps correctly. ### Naming Conventions @@ -296,6 +456,23 @@ Key defines in variant.h: ## Build System +## Agent Tooling Baseline + +Mirror counterpart: `AGENTS.md` under **Agent Tooling Baseline**. + +To reduce avoidable agent mistakes, assume these tools are available (or install them before significant repo work): + +- **Required CLI basics**: `bash`, `git`, `find`, `grep`, `sed`, `awk`, `xargs` +- **Strongly recommended**: `rg` (ripgrep) for fast file/text search, `jq` for JSON processing +- **Build/test tools**: `python3`, `pip`, virtualenv (`python3 -m venv`), `platformio` (`pio`) +- **Containerized native testing**: `docker` (fallback for non-Linux hosts; macOS can also build natively via `pio run -e native-macos`) + +Fallback expectations for agents: + +- If `rg` is unavailable, use `find` + `grep` instead of failing. +- For native tests on hosts without Linux deps, prefer `./bin/test-native-docker.sh`. +- The simulator helper script is `./bin/test-simulator.sh`. 
+ Uses **PlatformIO** with custom scripts: - `bin/platformio-pre.py` - Pre-build script @@ -307,6 +484,7 @@ Build commands: pio run -e tbeam # Build specific target pio run -e tbeam -t upload # Build and upload pio run -e native # Build native/Linux version +pio run -e native-macos # Build headless macOS meshtasticd (Homebrew prereqs in variants/native/portduino/platformio.ini) ``` ### Build Manifest @@ -429,6 +607,8 @@ Most workflows can be triggered manually via `workflow_dispatch` for testing. ## Testing +### Native unit tests (C++) + Unit tests in `test/` directory with 12 test suites: - `test_crypto/` - Cryptography @@ -446,6 +626,176 @@ Run with: `pio test -e native` Simulation testing: `bin/test-simulator.sh` +Quick entry point for new test modules: `test/README.md` (native unit-test authoring guide, skeleton, pitfalls, and setup checklist). + +### Hardware-in-the-loop tests (`mcp-server/tests/`) + +Separate pytest suite that exercises real USB-connected Meshtastic devices. See the **MCP Server & Hardware Test Harness** section below for invocation, tier layout, and agent usage rules. + +## MCP Server & Hardware Test Harness + +The `mcp-server/` directory houses a firmware-aware [MCP](https://modelcontextprotocol.io/) server plus a pytest-based integration suite. AI agents that speak MCP get a well-defined tool surface for flashing, configuring, and inspecting physical Meshtastic devices — use it instead of hand-rolling `pio` or `meshtastic --port` calls where possible. `mcp-server/README.md` is the operator-facing setup doc; this section is the agent-facing usage contract. + +The repo registers the server via `.mcp.json` at the repo root — Claude Code picks it up automatically once `mcp-server/.venv/` is built (`cd mcp-server && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'`). 
+ +### When to use which surface + +| Goal | Tool | +| ------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| Find a connected device | `mcp__meshtastic__list_devices` | +| Read a live node's config/state | `mcp__meshtastic__device_info`, `list_nodes`, `get_config` | +| Mutate a device (owner, region, channels, reboot) | `set_owner`, `set_config`, `set_channel_url`, `reboot`, `shutdown`, `factory_reset` — all require `confirm=True` | +| Flash firmware to a variant | `pio_flash` (any arch) or `erase_and_flash` (ESP32 factory install) | +| Stream serial logs while debugging | `serial_open` → `serial_read` loop → `serial_close` | +| Administer `userPrefs.jsonc` build-time constants | `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest` | +| Run the regression suite | `./mcp-server/run-tests.sh` (or `/test` slash command) | +| Diagnose a specific device | `/diagnose [role]` slash command (read-only) | +| Triage a flaky test | `/repro [count]` slash command | + +**One MCP call per port at a time.** `SerialInterface` holds an exclusive OS-level lock on the serial port for its lifetime. If a `serial_*` session is open on `/dev/cu.usbmodem101`, calling `device_info` on the same port will fail fast pointing at the active session. Sequence calls: open → read/mutate → close, then next device. Never parallelize tool calls on the same port. + +### MCP tool surface (43 tools) + +Grouped by purpose. Full argument shapes in `mcp-server/README.md`; a few high-value signatures are called out here. 
+ +- **Discovery & metadata**: `list_devices`, `list_boards`, `get_board` +- **Build & flash**: `build`, `clean`, `pio_flash`, `erase_and_flash` (ESP32 only), `update_flash` (ESP32 OTA), `touch_1200bps` +- **Serial sessions** (long-running, 10k-line ring buffer): `serial_open`, `serial_read`, `serial_list`, `serial_close` +- **Device reads**: `device_info`, `list_nodes` +- **Device writes**: `set_owner`, `get_config`, `set_config`, `get_channel_url`, `set_channel_url`, `send_text`, `send_input_event` (inject a button/key press via the firmware's InputBroker), `set_debug_log_api`; destructive/power-state writes require `confirm=True`: `reboot`, `shutdown`, `factory_reset` +- **userPrefs admin** (build-time constants, not runtime config): `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest`, `userprefs_testing_profile` +- **Vendor escape hatches**: `esptool_chip_info`, `esptool_erase_flash`, `esptool_raw`, `nrfutil_dfu`, `nrfutil_raw`, `picotool_info`, `picotool_load`, `picotool_raw` +- **USB power control** (via `uhubctl`, per-port PPPS toggle): `uhubctl_list` (read-only), `uhubctl_power(action='on'|'off', confirm=True)`, `uhubctl_cycle(delay_s, confirm=True)`. Target by raw `(location, port)` or by `role` (`"nrf52"`, `"esp32s3"`); role lookup checks `MESHTASTIC_UHUBCTL_LOCATION_` + `_PORT_` env vars first, falls back to VID auto-detection. +- **Observability** (UI tier + operator ad-hoc): `capture_screen(role, ocr=True)` — grabs a USB-webcam frame of the device OLED and optionally OCRs it. Requires `mcp-server[ui]` extras (`opencv-python-headless`, `easyocr`) and `MESHTASTIC_UI_CAMERA_DEVICE_` env var; falls through to a 1×1 black PNG `NullBackend` when unconfigured. + +`confirm=True` is a tool-level gate on top of whatever permission prompt your MCP host shows. 
**Don't bypass it** by asking the host to auto-approve — it exists specifically because MCP hosts sometimes remember "always allow this tool" and that's dangerous for `factory_reset`, `erase_and_flash`, `uhubctl_power(action='off')`, and `uhubctl_cycle`. + +**TCP / native-host nodes.** Setting `MESHTASTIC_MCP_TCP_HOST=` makes `list_devices` surface a `meshtasticd` daemon (e.g. the `native-macos` build) as a synthetic `tcp://host:port` entry, and `connect()` routes through `meshtastic.tcp_interface.TCPInterface` instead of `SerialInterface`. Every read/write/admin tool that flows through `connect()` works against the daemon transparently. USB-only tools (`pio_flash`, `erase_and_flash`, `update_flash`, `touch_1200bps`, `serial_open`, `esptool_*`, `nrfutil_*`, `picotool_*`) raise a clear `ConnectionError` when handed a `tcp://` port; `pio_flash` against a `native*` env raises a `FlashError` (no upload step — use `build` and run the binary directly). The pytest harness still assumes USB-attached devices per role; TCP-aware fixtures are deferred. See `mcp-server/README.md` § "TCP / native-host nodes". + +### Hardware test suite (`mcp-server/run-tests.sh`) + +The wrapper auto-detects connected devices (VID → role map: `0x239A` → `nrf52`, `0x303A`/`0x10C4` → `esp32s3`), maps each role to a PlatformIO env (`nrf52` → `rak4631`, `esp32s3` → `heltec-v3`, overridable via `MESHTASTIC_MCP_ENV_`), then invokes pytest. Zero pre-flight config needed from the operator. + +Suite tiers (collected + run in this order via `pytest_collection_modifyitems`): + +1. `tests/unit/` — pure Python (boards parse, pio wrapper, userPrefs parse, testing profile, uhubctl parser). No hardware. +2. `tests/test_00_bake.py` — flashes each detected device with current `userPrefs.jsonc` merged with the session's test profile. Has its own skip-if-already-baked check comparing region + primary channel to the session profile; skips cheaply on warm devices. +3. 
`tests/mesh/` — multi-device mesh: bidirectional send, broadcast delivery, direct-with-ACK, mesh formation within 60s. Parametrized `[nrf52->esp32s3]` and `[esp32s3->nrf52]`. Includes `test_peer_offline_recovery` which uses uhubctl to physically power off one peer mid-conversation (requires uhubctl; skips without). +4. `tests/telemetry/` — `DEVICE_METRICS_APP` broadcast timing. +5. `tests/monitor/` — boot-log panic check. +6. `tests/recovery/` — `uhubctl` power-cycle round-trip + NVS persistence across hard reset. Requires `uhubctl` installed and a PPPS-capable hub; entire tier auto-skips otherwise. +7. `tests/ui/` — input-broker-driven screen navigation with camera + OCR evidence. +8. `tests/fleet/` — PSK seed session isolation. +9. `tests/admin/` — channel URL roundtrip, owner persistence across reboot. +10. `tests/provisioning/` — region + modem + slot bake, admin key presence, `UNSET` region blocks TX, userPrefs survive factory reset. + +Invocation patterns: + +```bash +./mcp-server/run-tests.sh # full suite (auto-bake-if-needed) +./mcp-server/run-tests.sh --force-bake # reflash before testing +./mcp-server/run-tests.sh --assume-baked # skip bake (caller vouches for device state) +./mcp-server/run-tests.sh tests/mesh # one tier +./mcp-server/run-tests.sh tests/mesh/test_direct_with_ack.py # one file +./mcp-server/run-tests.sh -k telemetry # name filter +``` + +**No hardware detected?** The wrapper auto-narrows to `tests/unit/` only and prints `detected hub : (none)` in the pre-flight header. Agents interpreting the output should call this out explicitly: a green run that covered only `tests/unit/` because no hardware was attached is qualitatively different from a green run of the full hardware suite. + +**Artifacts every run produces:** + +- `mcp-server/tests/report.html` — self-contained pytest-html. Each test gets a `Meshtastic debug` section with the tail of firmware log + device state dump. **Open this first** on failures; it's the canonical evidence source. +- `mcp-server/tests/junit.xml` — CI-parseable. 
+- `mcp-server/tests/reportlog.jsonl` — pytest-reportlog stream (`$report_type` keyed JSONL). Consumed by the live TUI. +- `mcp-server/tests/fwlog.jsonl` — firmware log mirror from the `meshtastic.log.line` pubsub topic. Populated by the `_firmware_log_stream` autouse session fixture. + +### Live TUI (`meshtastic-mcp-test-tui`) + +A Textual-based live view that wraps `run-tests.sh`. Tails reportlog for per-test state, streams firmware logs, polls device state at startup + post-run (gated out of the active run because `hub_devices` holds exclusive port locks). Key bindings: + +| Key | Action | +| --- | ------------------------------------------------------------------------------------------------------------ | +| `r` | re-run focused test (leaf → that node id; internal node → directory or `-k`) | +| `f` | filter tree by substring | +| `d` | failure detail modal (pulls `longrepr` + captured stdout from the reportlog) | +| `g` | export reproducer bundle (tar.gz with README, test_report.json, time-filtered fwlog, devices.json, env.json) | +| `l` | toggle firmware log pane | +| `x` | tool coverage modal | +| `c` | cross-run history sparkline | +| `q` | quit (SIGINT → SIGTERM → SIGKILL escalation, 5-s windows each) | + +Launch: + +```bash +cd mcp-server +.venv/bin/meshtastic-mcp-test-tui # full suite +.venv/bin/meshtastic-mcp-test-tui tests/mesh # args pass through to pytest +``` + +The plain CLI stays primary; the TUI is for operators who want a live dashboard. Both consume the same `run-tests.sh`. + +### Slash commands (Claude Code + Copilot) + +Three AI-assisted workflows wrap the test harness. Claude Code operators get `/test`, `/diagnose`, `/repro`; Copilot operators get `/mcp-test`, `/mcp-diagnose`, `/mcp-repro`. Bodies: + +- `.claude/commands/{test,diagnose,repro}.md` +- `.github/prompts/mcp-{test,diagnose,repro}.prompt.md` + +`.claude/commands/README.md` is the index. 
+ +House rules for agents running these prompts: + +- **Interpret failures, don't just echo them.** Pull firmware log tails from `report.html` and classify each failure as transient / environmental / regression. Use the exact format in `.claude/commands/test.md`. +- **No destructive writes without operator approval.** Any skill that could reflash, factory-reset, or reboot a device must describe the action and stop. The operator authorizes. +- **Sequential MCP calls per port.** See above. +- **"Unknown" is a valid classification.** If evidence doesn't support a root cause, say so and list what would disambiguate. Do not invent. + +### Key fixtures (test authors + agents debugging) + +`mcp-server/tests/conftest.py` provides: + +- **`_session_userprefs`** (autouse session) — snapshots `userPrefs.jsonc` at session start, merges the session test profile via `userprefs.merge_active(test_profile)`, restores at teardown. Four layers of safety: pytest teardown + `atexit` + sidecar file (`userPrefs.jsonc.mcp-session-bak`) + startup self-heal in `run-tests.sh`. **Do not edit `userPrefs.jsonc` from inside a test.** +- **`_firmware_log_stream`** (autouse session) — subscribes to `meshtastic.log.line` pubsub on every connected `SerialInterface` and mirrors lines to `tests/fwlog.jsonl`. Drives the TUI firmware-log pane. +- **`_debug_log_buffer`** (autouse per-test) — captures last 200 firmware log lines + device state for attachment to the pytest-html `Meshtastic debug` section on failure. +- **`hub_devices`** (session) — `dict[role, SerialInterface]` with session-long exclusive port locks. Reason the TUI's device poller is gated to startup + post-run only. +- **`baked_mesh`** — parametrized mesh-pair fixture; depends on `test_00_bake`. `pytest_generate_tests` in `conftest.py` auto-generates `[nrf52->esp32s3]` and `[esp32s3->nrf52]` variants. +- **`test_profile`** — session-scoped dict: region, primary channel, admin key, PSK seed. 
Derived from `MESHTASTIC_MCP_SEED` (defaults to `mcp--`). + +### Firmware integration points tied to the test harness + +Two firmware changes exist specifically so the test harness works reliably. **Keep these in mind when touching related code.** + +- **`src/mesh/StreamAPI.cpp` + `StreamAPI.h`** — `emitLogRecord` uses a dedicated `fromRadioScratchLog` + `txBufLog` pair and a `concurrency::Lock streamLock`. Before this fix, `debug_log_api_enabled=true` would tear `FromRadio` protobufs on the serial transport because `emitTxBuffer` and `emitLogRecord` shared a single scratch buffer. The conftest enables the log stream session-wide; without this fix the device would corrupt its own FromRadio replies mid-session. +- **`src/mesh/PhoneAPI.cpp`** — `ToRadio` `Heartbeat(nonce=1)` triggers `nodeInfoModule->sendOurNodeInfo(NODENUM_BROADCAST, true, 0, true)` for serial clients, mirroring the pre-existing behavior for TCP/UDP clients in `PacketAPI.cpp`. The mesh tests rely on this to force a NodeInfo broadcast right after connect so the peer discovers them before the test's first assertion. + +If you're modifying `StreamAPI`, `PhoneAPI`, `NodeInfoModule`, or `userPrefs` flow, run `./mcp-server/run-tests.sh` at minimum before asking for review. + +### Recovery playbooks + +| Symptom | First check | Fix | +| --------------------------------------------------------------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `userPrefs.jsonc` dirty after test run | `git status --porcelain userPrefs.jsonc` | If non-empty, re-run `./mcp-server/run-tests.sh` once — the pre-flight self-heal restores from sidecar. If still dirty, `git checkout userPrefs.jsonc`. 
| +| Port busy / wedged CP2102 on macOS | `lsof /dev/cu.usbserial-0001` | Kill the holder. USB replug if the kernel still reports busy. Often a stale `pio device monitor` or zombie `meshtastic_mcp` process. | +| nRF52 appears unresponsive | `list_devices` shows VID `0x239A` but `device_info` times out | `touch_1200bps(port=...)` drops it into the DFU bootloader → `pio_flash` re-installs. | +| Device fully wedged (Guru Meditation, frozen CDC, no DFU) | `list_devices` shows the VID but every admin call times out | `uhubctl_cycle(role="nrf52", confirm=True)` hard-power-cycles the port via USB hub PPPS. `baked_single`'s auto-recovery hook does this once automatically if uhubctl is installed. Falls back to physical replug if no PPPS hub. | +| Multiple MCP server processes | `ps aux \| grep meshtastic_mcp` shows >1 | Kill all but the one your MCP host spawned. Zombies hold ports and break tests. | +| Mesh formation fails, one side sees peer but other doesn't | `/diagnose` (or `list_nodes` on both sides) | Asymmetric NodeInfo. `test_direct_with_ack` has a heal path; `/repro` it a few times. If persistent, both devices' clocks may be out of sync with their NodeInfo cooldown. | +| "role not present on hub" in skip reasons | `list_devices` | Expected if a device is unplugged. Reconnect before re-running the tier. | +| Entire `tests/recovery/` tier skipped | `command -v uhubctl` | Expected if `uhubctl` isn't on PATH. Install via `brew install uhubctl` (macOS) or `apt install uhubctl` (Debian/Ubuntu). Also skips if no hub advertises PPPS. | +| Entire `tests/ui/` tier skipped ("firmware not baked with USERPREFS_UI_TEST_LOG") | reportlog.jsonl for the skip reason | Re-run with `--force-bake` so the UI-log macro gets compiled into the fresh firmware. First run after the Round-3 landing always re-bakes. | +| `tests/ui/` runs but captures are all 1×1 black PNGs | `MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3` | Env var not set → `NullBackend`. 
Point a USB webcam at the heltec-v3 OLED and set the device index; `.venv/bin/python -c "import cv2; [print(i, cv2.VideoCapture(i).read()[0]) for i in range(5)]"` discovers it. | +| Tests fail only on first attempt then pass on rerun | — | State leak from a prior session. Run with `--force-bake` to reset to a known state. | + +### Never do these without asking + +- `factory_reset` — wipes node identity; regenerates PKI keypair. Mesh peers will reject old DMs until re-exchange. Legitimate only when the operator explicitly wants it. +- `erase_and_flash` — full chip erase; destroys all on-device state. +- `esptool_erase_flash` / `esptool_raw` write/erase — bypasses pio's safety chain. +- `set_config` on `lora.region` — changes regulatory domain; requires physical-location context the operator has and the agent doesn't. +- `reboot` / `shutdown` mid-test — breaks fixture invariants. +- `push -f`, `rebase -i`, `reset --hard`, or any history-rewriting git operation. +- Clicking computer-use tools on web links in Mail/Messages/PDFs — open URLs via the claude-in-chrome MCP so the extension's link-safety checks apply. + ## Resources - [Documentation](https://meshtastic.org/docs/) diff --git a/.github/prompts/mcp-diagnose.prompt.md b/.github/prompts/mcp-diagnose.prompt.md new file mode 100644 index 00000000000..1049858f8ef --- /dev/null +++ b/.github/prompts/mcp-diagnose.prompt.md @@ -0,0 +1,64 @@ +--- +mode: agent +description: Device health report via the meshtastic MCP tools (Copilot equivalent of the Claude Code /diagnose slash command) +--- + +# `/mcp-diagnose` — device health report + +Equivalent of `.claude/commands/diagnose.md`. Use when the operator asks to "check the devices", "what's the mesh looking like", "is nrf52 alive", etc. + +This prompt assumes the meshtastic MCP server is registered with your VS Code Copilot agent. If it isn't, fall back to running `./mcp-server/run-tests.sh tests/unit` plus a short `device_info` script via the terminal. 
+ +## What to do + +1. **Enumerate hardware** via the `list_devices` MCP tool (with `include_unknown=True`). For each entry where `likely_meshtastic=True`, capture `port`, `vid`, `pid`, `description`. + +2. **Apply the operator's filter** (if any): + - No filter → every likely-meshtastic device. + - `nrf52` → `vid == 0x239a` + - `esp32s3` → `vid == 0x303a` or `vid == 0x10c4` + - A `/dev/cu.*` path → only that port. + - Anything else → substring match on port. + +3. **For each selected device, in sequence (don't parallelize — SerialInterface holds an exclusive port lock):** + - `device_info(port=)` → `my_node_num`, `long_name`, `short_name`, `firmware_version`, `hw_model`, `region`, `num_nodes`, `primary_channel` + - `list_nodes(port=)` → peer count, which peers have `publicKey`, SNR/RSSI distribution + - `get_config(section="lora", port=)` → region, preset, channel_num, tx_power, hop_limit + - If anything looks off (can't connect, `num_nodes` wrong, missing `firmware_version`), open a short firmware-log window: `serial_open(port=, env=)`, wait 3 seconds, `serial_read(session_id, max_lines=100)`, `serial_close(session_id)`. Infer env from VID (0x239a → `rak4631`, 0x303a/0x10c4 → `heltec-v3`) unless a `MESHTASTIC_MCP_ENV_` env var overrides it. + +4. **Hub health** (call once, not per-device): `uhubctl_list()` — enumerates every USB hub the host sees. Cross-reference each Meshtastic device's VID to find which hub + port it's on. Flag in the report if: + - No hub advertises `ppps=true` → `tests/recovery/` can't run; hard-recovery via `uhubctl_cycle` isn't available. + - A Meshtastic device is on a non-PPPS hub → note it; moving to a PPPS hub unlocks auto-recovery. + - `uhubctl_list` raises `ConfigError: uhubctl not found` → report as "uhubctl not installed"; don't treat as a device fault. + +5. **Render per-device report** as a compact block: + + ```text + [nrf52 @ /dev/cu.usbmodem1101] fw=2.7.23.bce2825, hw=RAK4631 + owner : Meshtastic 40eb / 40eb + region/band : US, channel 88, LONG_FAST + tx_power : 30 dBm, hop_limit=3 + peers : 1 (esp32s3 0x433c2428, pubkey ✓, SNR 6.0 / RSSI -24 dBm) + primary ch : McpTest + hub : 1-1.3 port 2 (PPPS, uhubctl-controllable) + firmware : no panics in last 3s + ``` + + Flag abnormalities inline with `⚠︎ ` — missing pubkey on a known peer, region UNSET, mismatched channel name, device on non-PPPS hub, etc. + +6. **Cross-device correlation** (when >1 device selected): + - Do both see each other in `nodesByNum`? + - Do `region`, `channel_num`, `modem_preset` match across devices? + - Do the primary channel names match? (Different name → different PSK → no decode.) + +7. **Suggest next steps only for recognizable failure modes**, never speculatively: + - Stale PKI one-way → "`/mcp-test tests/mesh/test_direct_with_ack.py` — the test's retry+nodeinfo-ping heals this." + - Region mismatch → "re-bake one side via `./mcp-server/run-tests.sh --force-bake`." + - Device unreachable, DFU reachable → `touch_1200bps(port=...)` + `pio_flash`.
If not even DFU responds and the device is on a PPPS hub, escalate to `uhubctl_cycle(role=..., confirm=True)`. + - CP2102-wedged-driver on macOS → see `run-tests.sh` notes. + +## Hard constraints + +- **Read-only.** No `set_config`, no `reboot`, no `factory_reset`, no `flash`. If the operator wants mutation, they'll escalate explicitly. +- **Open/query/close per device.** Never hold multiple SerialInterfaces to the same port. The port lock is exclusive. +- **Don't infer env beyond the VID map** — if the operator has an unusual board, ask them which env to use rather than guessing. diff --git a/.github/prompts/mcp-repro.prompt.md b/.github/prompts/mcp-repro.prompt.md new file mode 100644 index 00000000000..3a7c5c3de99 --- /dev/null +++ b/.github/prompts/mcp-repro.prompt.md @@ -0,0 +1,68 @@ +--- +mode: agent +description: Re-run a specific test N times to triage flakes; diff firmware logs between passes and failures (Copilot equivalent of the Claude Code /repro slash command) +--- + +# `/mcp-repro` — flakiness triage for one test + +Equivalent of `.claude/commands/repro.md`. Use when the operator says "that one test is flaky — dig in", "repro the direct_with_ack failure", "why does X sometimes fail?". + +## What to do + +1. **Parse the operator's input** into two pieces: + - **Test identifier** — either a pytest node id (has `::` or starts with `tests/`) or a `-k`-style filter (plain substring like `direct_with_ack`). + - **Count** — integer, default `5`, cap at `20`. If the operator asks for 50, negotiate down and explain (airtime + USB wear). + +2. **Sanity-check the hub** via the `list_devices` MCP tool. If the test name references `nrf52` or `esp32s3` and the matching VID isn't present, stop and report — re-running won't help. + +3. **Loop** N times. Each iteration: + + ```bash + ./mcp-server/run-tests.sh --tb=short -p no:cacheprovider + ``` + + `-p no:cacheprovider` keeps pytest from caching anything between iterations. 
Capture: exit code, duration, and (on failure) the `Meshtastic debug` firmware-log section from `mcp-server/tests/report.html`. + +4. **Tally** results as you go: + + ```text + attempt 1: PASS (42s) + attempt 2: FAIL (128s) ← fw log captured + attempt 3: PASS (39s) + attempt 4: FAIL (121s) + attempt 5: PASS (41s) + -------------------------------------------------- + pass rate: 3/5 (60%) | mean duration: 74s + ``` + +5. **On mixed outcomes, diff the firmware logs** between one representative pass and one representative fail. Focus on: + - Error-level lines present only in failures (`PKI_UNKNOWN_PUBKEY`, `Alloc an err=`, `Skip send`, `No suitable channel`, `NAK`) + - Timing around the assertion point (broadcast sent? ACK received? retry fired?) + - Device-state fields that changed between attempts + + Surface the top 3 differences as a compact "passes when / fails when" table with uptime timestamps. Don't dump full logs. + +6. **Classify** the flake into one of: + - **LoRa airtime collision** — pass rate improves with fewer concurrent transmitters. Suggest a `time.sleep` gap or retry bump in the test body. + - **PKI key staleness** — first attempt fails, subsequent ones pass; existing retry-loop pattern in `test_direct_with_ack.py` is the fix. + - **NodeInfo cooldown** — `Skip send NodeInfo since we sent it <600s ago` in fail-only logs; needs a `broadcast_nodeinfo_ping()` warmup. + - **Hardware-specific** — one direction consistently fails, firmware versions differ, CP2102 driver wedged, etc. For a device wedged past `touch_1200bps`, recommend `uhubctl_cycle(role=..., confirm=True)` to hard-power-cycle its hub port (requires `uhubctl` installed). + - **Device went dark mid-run** — fails from some iteration onward and never recovers; firmware log stops arriving. Almost always a Guru crash with frozen CDC. Recommend `uhubctl_cycle` before the next iteration; escalate to replug if that also fails. + - **Unknown** — say so. Don't invent a root cause. + +7. 
**Report back** with: + - Pass rate + mean duration. + - Classification + the specific log evidence for it. + - A concrete next step (tighter assertion, more retries, open `/mcp-diagnose`, file a bug, nothing). + +## Examples + +- `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[esp32s3->nrf52] 10` — 10 runs of that parametrized case. +- `broadcast_delivers` — no `::`, no `tests/`; treat as `-k broadcast_delivers`; runs every match 5 times. +- `tests/telemetry/test_device_telemetry_broadcast.py 3` — shorter count for a slow test. + +## Notes + +- If the FIRST attempt fails and the rest pass, that's a state-leak signature — suggest starting from `--force-bake` or a clean device state rather than chasing the first-failure firmware logs. +- If ALL N fail, this isn't a flake — it's a regression. Say so, stop iterating, escalate to `/mcp-test` for full-suite context. +- Don't rebuild firmware during triage. Flakes that only reproduce under different firmware belong in a separate session with a plan. diff --git a/.github/prompts/mcp-test.prompt.md b/.github/prompts/mcp-test.prompt.md new file mode 100644 index 00000000000..148569e83da --- /dev/null +++ b/.github/prompts/mcp-test.prompt.md @@ -0,0 +1,57 @@ +--- +mode: agent +description: Run the mcp-server test suite and interpret results (Copilot equivalent of the Claude Code /test slash command) +--- + +# `/mcp-test` — mcp-server test runner with interpretation + +Equivalent of the Claude Code `/test` slash command in `.claude/commands/test.md`. Use this when the operator asks you to "run the tests", "check the mcp test suite", "run the mesh tests", etc. + +## What to do + +1. **Invoke the wrapper** from the firmware repo root: + + ```bash + ./mcp-server/run-tests.sh [pytest-args] + ``` + + If the operator specified a subset (e.g. "just the mesh tests"), pass it through as `tests/mesh` or a pytest `-k filter`. If they said nothing, use the wrapper's defaults (full suite with pytest-html report). 
+ + The wrapper auto-detects connected Meshtastic devices, maps each to its PlatformIO env, exports the required env vars, and invokes pytest. Zero pre-flight config needed from the operator. + +2. **Read the pre-flight header** (first few lines of wrapper output). The `detected hub :` line lists role → port → env mappings. If it reads `(none)`, the wrapper narrowed to `tests/unit` only — call that out explicitly so the operator knows hardware tiers were skipped. + +3. **On pass**: one-line summary like `N passed, M skipped in `. Don't enumerate test names. DO mention any non-placeholder SKIPs and name the cause: + - `"role not present on hub"` → device unplugged; operator should reconnect. + - `"firmware not baked with USERPREFS_UI_TEST_LOG"` → tests/ui skipped; the UI-log compile macro isn't in the baked firmware. Suggest `--force-bake`. + - `"uhubctl not installed"` → tests/recovery + `test_peer_offline_recovery` skipped. Suggest `brew install uhubctl` / `apt install uhubctl`. + - `"no PPPS-capable hubs detected"` → tests/recovery skipped because the attached hub doesn't support per-port power switching; won't run on that setup. + - `"opencv-python-headless is not installed"` → tests/ui auto-deselected by `run-tests.sh`. Suggest `pip install -e 'mcp-server/.[ui]'`. + +4. **On failure**: open `mcp-server/tests/report.html` (pytest-html output, self-contained) and extract the `Meshtastic debug` section for each failed test. That section includes a firmware log stream (last 200 lines) and device state dump. For each failure, summarise: + - test name + - one-line assertion message + - the specific firmware log lines that explain why (look for `PKI_UNKNOWN_PUBKEY`, `Skip send NodeInfo`, `Error=`, `Guru Meditation`, `assertion failed`, `No suitable channel`) + - for UI-tier failures also check `mcp-server/tests/ui_captures///transcript.md` (per-step frame + OCR) + +5. 
**Classify each failure** as one of: + - **Transient flake** — LoRa collision, first-attempt NAK with self-heal pattern, timing-sensitive assertion. Suggest `/mcp-repro ` to confirm. + - **Environmental** — device unreachable, port busy, CP2102 driver wedged on macOS. Suggest recovery in escalation order: (a) replug USB, (b) `touch_1200bps` + `pio_flash` for nRF52 DFU, (c) `uhubctl_cycle(role=..., confirm=True)` for a device wedged past DFU (needs `uhubctl` installed; `baked_single` does this once automatically when available). Also check `git status userPrefs.jsonc`. + - **Regression** — same assertion fails repeatedly on re-runs, firmware log shows novel errors. Identify the firmware module likely responsible. + +6. **Do NOT run destructive recovery automatically**. If a failure looks like it needs a reflash, `factory_reset`, `uhubctl_cycle`, or replug — _describe the steps_ and let the operator decide. Never burn airtime or flash cycles without approval. + +## Arguments convention + +Operators generally invoke this prompt either with no arguments (full suite) or with a specific subset. Examples: + +- `tests/mesh` — one tier +- `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip` — one test +- `--force-bake` — reflash devices first +- `-k telemetry` — name-filter + +## Side-effects to confirm in your summary + +- `userPrefs.jsonc` should be clean after a successful run. The session fixture in `mcp-server/tests/conftest.py` (`_session_userprefs`) snapshots and restores. Check `git status --porcelain userPrefs.jsonc` and report if it's non-empty. +- `mcp-server/tests/report.html` and `junit.xml` regenerate on every run. +- The wrapper prints a warning if a `.mcp-session-bak` sidecar was left over from a crashed prior session and auto-restores from it — mention that if it happened.
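
A minimal sketch of the `userPrefs.jsonc` cleanliness check described in the side-effects list, written in Python. The helper name `userprefs_dirty_lines` and its `run` parameter are hypothetical (they are not part of the wrapper or of `conftest.py`); the only piece taken from this prompt is the `git status --porcelain userPrefs.jsonc` command itself.

```python
import subprocess
from typing import Callable, List


def userprefs_dirty_lines(repo_root: str = ".", run: Callable = subprocess.run) -> List[str]:
    """Return the porcelain status lines for userPrefs.jsonc.

    An empty list means the file is clean. `run` is injectable
    (an assumption of this sketch) so the check can be exercised
    without a real git checkout.
    """
    result = run(
        ["git", "status", "--porcelain", "userPrefs.jsonc"],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    # Porcelain output is one line per changed path; drop blank lines.
    return [line for line in result.stdout.splitlines() if line.strip()]
```

An empty result means the session fixture restored the file; any returned line is worth quoting verbatim in the summary so the operator can see what changed.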
diff --git a/.github/prompts/new-module.prompt.md b/.github/prompts/new-module.prompt.md index 8569a622c55..08b2395970a 100644 --- a/.github/prompts/new-module.prompt.md +++ b/.github/prompts/new-module.prompt.md @@ -118,7 +118,7 @@ CallbackObserver statusObserver = Add test suite in `test/test_mymodule/`: -``` +```text test/ └── test_mymodule/ └── test_main.cpp diff --git a/.github/prompts/new-variant.prompt.md b/.github/prompts/new-variant.prompt.md index 1a324cea95d..666e264e0bd 100644 --- a/.github/prompts/new-variant.prompt.md +++ b/.github/prompts/new-variant.prompt.md @@ -6,7 +6,7 @@ Guide for adding a new Meshtastic hardware variant to the firmware. Create under `variants///`: -``` +```text variants/ ├── esp32/ # ESP32 ├── esp32s3/ # ESP32-S3 diff --git a/.github/workflows/build_debian_src.yml b/.github/workflows/build_debian_src.yml index d1bcd889890..8d2076b113f 100644 --- a/.github/workflows/build_debian_src.yml +++ b/.github/workflows/build_debian_src.yml @@ -32,10 +32,15 @@ jobs: shell: bash working-directory: meshtasticd run: | + # Build-tools (notably platformio) come from the Meshtastic project + # on the OpenSUSE Build Service: + # https://build.opensuse.org/project/show/network:Meshtastic:build-tools + echo 'deb http://download.opensuse.org/repositories/network:/Meshtastic:/build-tools/xUbuntu_24.04/ /' \ + | sudo tee /etc/apt/sources.list.d/network:Meshtastic:build-tools.list + curl -fsSL https://download.opensuse.org/repositories/network:Meshtastic:build-tools/xUbuntu_24.04/Release.key \ + | gpg --dearmor | sudo tee /etc/apt/trusted.gpg.d/network_Meshtastic_build-tools.gpg >/dev/null sudo apt-get update -y --fix-missing - sudo apt-get install -y software-properties-common build-essential devscripts equivs - sudo add-apt-repository ppa:meshtastic/build-tools -y - sudo apt-get update -y --fix-missing + sudo apt-get install -y build-essential devscripts equivs sudo mk-build-deps --install --remove --tool='apt-get -o Debug::pkgProblemResolver=yes 
--no-install-recommends --yes' debian/control - name: Import GPG key diff --git a/.github/workflows/build_macos_bin.yml b/.github/workflows/build_macos_bin.yml new file mode 100644 index 00000000000..d0e89d7da6e --- /dev/null +++ b/.github/workflows/build_macos_bin.yml @@ -0,0 +1,51 @@ +name: Build MacOS Binary + +on: + workflow_call: + inputs: + macos_ver: + required: false + default: "26" # ARM64 + type: string + +permissions: + contents: read + +jobs: + build-MacOS: + runs-on: macos-${{ inputs.macos_ver }} + steps: + - name: Checkout code + uses: actions/checkout@v6 + with: + submodules: recursive + + - name: Install deps + shell: bash + run: | + brew update + brew install platformio yaml-cpp libuv openssl@3 libusb argp-standalone pkg-config ulfius + + - name: Get release version string + run: | + echo "long=$(./bin/buildinfo.py long)" >> $GITHUB_OUTPUT + id: version + + - name: Build for MacOS + run: | + platformio run -e native-macos + env: + PKG_VERSION: ${{ steps.version.outputs.long }} + # Errors in this step should not fail the entire workflow while MacOS support is in development. 
+ continue-on-error: true + + - name: List output files + run: ls -lah .pio/build/native-macos/ + + - name: Store binaries as an artifact + uses: actions/upload-artifact@v7 + with: + name: firmware-macos-${{ inputs.macos_ver }}-${{ steps.version.outputs.long }} + overwrite: true + path: | + .pio/build/native-macos/meshtasticd diff --git a/.github/workflows/docker_build.yml b/.github/workflows/docker_build.yml index d9b23a7e810..8a3ef0e6cd7 100644 --- a/.github/workflows/docker_build.yml +++ b/.github/workflows/docker_build.yml @@ -73,7 +73,9 @@ jobs: - name: Sanitize platform string id: sanitize_platform # Replace slashes with underscores - run: echo "cleaned_platform=${{ inputs.platform }}" | sed 's/\//_/g' >> $GITHUB_OUTPUT + env: + plat: ${{ inputs.platform }} + run: echo "cleaned_platform=${plat}" | sed 's/\//_/g' >> $GITHUB_OUTPUT - name: Docker login if: ${{ inputs.push }} diff --git a/.github/workflows/docker_manifest.yml b/.github/workflows/docker_manifest.yml index b2fd1259914..4bfdfe37e47 100644 --- a/.github/workflows/docker_manifest.yml +++ b/.github/workflows/docker_manifest.yml @@ -43,6 +43,15 @@ jobs: push: true secrets: inherit + docker-debian-riscv64: + uses: ./.github/workflows/docker_build.yml + with: + distro: debian + platform: linux/riscv64 + runs-on: ubuntu-24.04-arm + push: true + secrets: inherit + docker-alpine-amd64: uses: ./.github/workflows/docker_build.yml with: @@ -70,16 +79,27 @@ jobs: push: true secrets: inherit + docker-alpine-riscv64: + uses: ./.github/workflows/docker_build.yml + with: + distro: alpine + platform: linux/riscv64 + runs-on: ubuntu-24.04-arm + push: true + secrets: inherit + docker-manifest: needs: # Debian - docker-debian-amd64 - docker-debian-arm64 - docker-debian-armv7 + - docker-debian-riscv64 # Alpine - docker-alpine-amd64 - docker-alpine-arm64 - docker-alpine-armv7 + - docker-alpine-riscv64 runs-on: ubuntu-24.04 steps: - name: Checkout code @@ -162,6 +182,7 @@ jobs: meshtastic/meshtasticd@${{ 
needs.docker-debian-amd64.outputs.digest }} meshtastic/meshtasticd@${{ needs.docker-debian-arm64.outputs.digest }} meshtastic/meshtasticd@${{ needs.docker-debian-armv7.outputs.digest }} + meshtastic/meshtasticd@${{ needs.docker-debian-riscv64.outputs.digest }} - name: Docker meta (Alpine) id: meta_alpine @@ -182,3 +203,4 @@ jobs: meshtastic/meshtasticd@${{ needs.docker-alpine-amd64.outputs.digest }} meshtastic/meshtasticd@${{ needs.docker-alpine-arm64.outputs.digest }} meshtastic/meshtasticd@${{ needs.docker-alpine-armv7.outputs.digest }} + meshtastic/meshtasticd@${{ needs.docker-alpine-riscv64.outputs.digest }} diff --git a/.github/workflows/main_matrix.yml b/.github/workflows/main_matrix.yml index 88395600a71..f46bf465260 100644 --- a/.github/workflows/main_matrix.yml +++ b/.github/workflows/main_matrix.yml @@ -116,6 +116,20 @@ jobs: build_location: local secrets: inherit + MacOS: + strategy: + fail-fast: false + matrix: + macos_ver: + - "26" # ARM64 + # - '26-intel' # x86_64 + - "15" # ARM64 + # - '15-intel' # x86_64 + uses: ./.github/workflows/build_macos_bin.yml + with: + macos_ver: ${{ matrix.macos_ver }} + # secrets: inherit + package-pio-deps-native-tft: if: ${{ github.repository == 'meshtastic/firmware' && github.event_name == 'workflow_dispatch' }} uses: ./.github/workflows/package_pio_deps.yml @@ -286,6 +300,7 @@ jobs: - gather-artifacts - build-debian-src - package-pio-deps-native-tft + # - MacOS steps: - name: Checkout uses: actions/checkout@v6 @@ -318,6 +333,7 @@ jobs: prerelease: true name: Meshtastic Firmware ${{ needs.version.outputs.long }} Alpha tag_name: v${{ needs.version.outputs.long }} + target_commitish: ${{ github.sha }} body: ${{ steps.release_notes.outputs.notes }} - name: Download source deb diff --git a/.github/workflows/test_native.yml b/.github/workflows/test_native.yml index 2fabf0591ed..1e22d74d165 100644 --- a/.github/workflows/test_native.yml +++ b/.github/workflows/test_native.yml @@ -86,7 +86,13 @@ jobs: run: sed -i 
's/-DBUILD_EPOCH=$UNIX_TIME/#-DBUILD_EPOCH=$UNIX_TIME/' platformio.ini - name: PlatformIO Tests - run: platformio test -e coverage -v --junit-output-path testreport.xml + run: | + set -o pipefail + # Filter out SKIPPED summary rows for hardware variants that can't run on the + # native host. They flood the log and make it harder to spot real failures. + # The JUnit XML is written directly to testreport.xml before the pipe, so + # the test artifact is unaffected. + platformio test -e coverage -v --junit-output-path testreport.xml 2>&1 | grep -v "[[:space:]]SKIPPED$" - name: Save test results if: always() # run this step even if previous step failed diff --git a/.gitignore b/.gitignore index 43cee78db73..55e90a8f28d 100644 --- a/.gitignore +++ b/.gitignore @@ -47,6 +47,10 @@ data/boot/logo.* managed_components/* arduino-lib-builder* dependencies.lock + +# JLink / RTT debug artifacts (nRF SoCs) +flash.jlink +rtt_*.txt idf_component.yml CMakeLists.txt /sdkconfig.* @@ -54,3 +58,11 @@ CMakeLists.txt # PYTHONPATH used by the Nix shell .python3 +.claude/scheduled_tasks.lock +userPrefs.jsonc.mcp-session-bak + +# Fake-NodeDB fixture pipeline (bin/regen-fake-nodedbs.sh) +# JSONL seeds are committed (test/fixtures/nodedb/seed_v25_*.jsonl); +# compiled .proto outputs are ephemeral build artifacts. +build/fixtures/ +bin/_generated/ diff --git a/.mcp.json b/.mcp.json new file mode 100644 index 00000000000..c5cf2e55e5a --- /dev/null +++ b/.mcp.json @@ -0,0 +1,11 @@ +{ + "mcpServers": { + "meshtastic": { + "command": "./mcp-server/.venv/bin/python", + "args": ["-m", "meshtastic_mcp"], + "env": { + "MESHTASTIC_FIRMWARE_ROOT": "." 
+ } + } + } +} diff --git a/.trunk/configs/.bandit b/.trunk/configs/.bandit index d286ded8974..c70e7743b67 100644 --- a/.trunk/configs/.bandit +++ b/.trunk/configs/.bandit @@ -1,2 +1,28 @@ [bandit] -skips = B101 \ No newline at end of file +# Rule IDs: https://bandit.readthedocs.io/en/latest/plugins/index.html +# +# B101 assert_used +# pytest assertions + internal invariants; required for pytest. +# B110 try_except_pass +# best-effort cleanup paths (atexit handlers, pubsub unsubscribe, +# session-end file close, socket shutdown). Logging inside the +# except block would be worse than the silent pass — teardown is +# already at end-of-session and the surrounding caller has context. +# B112 try_except_continue +# defensive loops over flaky sources (pubsub handlers, device +# re-enumeration polls). One failed iteration shouldn't abort the loop. +# B404 import_subprocess +# mcp-server wraps PlatformIO, esptool, nrfutil, picotool, and the +# pytest test-runner — subprocess is a load-bearing import here, not +# a smell. The "consider possible security implications" advisory is +# redundant given the file-level review already applied. +# B603 subprocess_without_shell_equals_true +# all subprocess calls use a static argv list; `shell=False` is the +# default and we never string-interpolate user input into the command. +# B606 start_process_with_no_shell +# same invariant as B603 — running a binary via argv list (not +# `shell=True`) is the safe pattern bandit is asking for. +# +# Higher-severity checks (B102 exec_used, B301 pickle, B307 eval, +# B602 shell=True, etc.) remain enabled. 
+skips = B101,B110,B112,B404,B603,B606 \ No newline at end of file diff --git a/.trunk/trunk.yaml b/.trunk/trunk.yaml index d0cbaa8bc57..88b0f51d4d6 100644 --- a/.trunk/trunk.yaml +++ b/.trunk/trunk.yaml @@ -4,29 +4,29 @@ cli: plugins: sources: - id: trunk - ref: v1.7.6 + ref: v1.10.0 uri: https://github.com/trunk-io/plugins lint: enabled: - - checkov@3.2.517 - - renovate@43.110.9 - - prettier@3.8.1 - - trufflehog@3.94.3 + - checkov@3.2.529 + - renovate@43.150.0 + - prettier@3.8.3 + - trufflehog@3.95.3 - yamllint@1.38.0 - bandit@1.9.4 - - trivy@0.69.3 + - trivy@0.70.0 - taplo@0.10.0 - - ruff@0.15.9 + - ruff@0.15.13 - isort@8.0.1 - markdownlint@0.48.0 - - oxipng@10.1.0 + - oxipng@10.1.1 - svgo@4.0.1 - actionlint@1.7.12 - flake8@7.3.0 - hadolint@2.14.0 - shfmt@3.6.0 - shellcheck@0.11.0 - - black@26.3.1 + - black@26.5.1 - git-diff-check - gitleaks@8.30.1 - clang-format@16.0.3 @@ -34,9 +34,16 @@ lint: - linters: [ALL] paths: - bin/** + # Fake-NodeDB fixture JSONL files contain deterministic synthetic + # public_key_hex (64-char hex) values that gitleaks misidentifies as + # generic-api-key. These are not secrets — they're test fixtures + # produced by bin/gen-fake-nodedb-seed.py with a fixed RNG seed. 
+ - linters: [gitleaks] + paths: + - test/fixtures/nodedb/seed_v25_*.jsonl runtimes: enabled: - - python@3.10.8 + - python@3.14.4 - go@1.21.0 - node@22.16.0 actions: diff --git a/.vscode/extensions.json b/.vscode/extensions.json index 080e70d08b9..66d8356e517 100644 --- a/.vscode/extensions.json +++ b/.vscode/extensions.json @@ -1,10 +1,10 @@ { - // See http://go.microsoft.com/fwlink/?LinkId=827846 - // for the documentation about the extensions.json format "recommendations": [ - "platformio.platformio-ide" + "Jason2866.esp-decoder", + "pioarduino.pioarduino-ide" ], "unwantedRecommendations": [ - "ms-vscode.cpptools-extension-pack" + "ms-vscode.cpptools-extension-pack", + "platformio.platformio-ide" ] } diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000000..82912f252f4 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,144 @@ +# Agent instructions + +This repository is the [Meshtastic](https://meshtastic.org) firmware — a C++17 embedded codebase targeting ESP32 / nRF52 / RP2040 / STM32WL / Linux-Portduino LoRa mesh radios — plus a Python MCP server in `mcp-server/` that AI agents use to flash, configure, and test connected devices. + +## Primary instruction file + +**Read `.github/copilot-instructions.md` first.** That file is the canonical agent-facing document for this repo. It covers project layout, coding conventions (naming, module framework, Observer pattern, thread safety), the build system, CI/CD, the native C++ test suite, and — most importantly for automation work — the **MCP Server & Hardware Test Harness** section. Read it top-to-bottom before starting any non-trivial change. + +This file (`AGENTS.md`) is a short pointer + quick reference for agents that don't read `.github/copilot-instructions.md` by default. 
+ +## Quick command reference + +| Action | Command | +| -------------------------------- | ------------------------------------------------------------------------------------------------------------- | +| Build a firmware variant | `pio run -e ` (e.g. `pio run -e rak4631`, `pio run -e heltec-v3`) | +| Build native macOS host binary | `pio run -e native-macos` (Homebrew prereqs + CH341 LoRa setup in `variants/native/portduino/platformio.ini`) | +| Clean + rebuild | `pio run -e -t clean && pio run -e ` | +| Flash a device | `pio run -e -t upload --upload-port ` (or use the `pio_flash` MCP tool) | +| Run firmware unit tests (native) | `pio test -e native` | +| Run MCP hardware tests | `./mcp-server/run-tests.sh` | +| Live TUI test runner | `mcp-server/.venv/bin/meshtastic-mcp-test-tui` | +| Format before commit | `trunk fmt` | +| Regenerate protobuf bindings | `bin/regen-protos.sh` | +| Generate CI matrix | `./bin/generate_ci_matrix.py all [--level pr]` | + +## MCP server (device + test automation) + +The `mcp-server/` package exposes ~32 MCP tools for device discovery, building, flashing, serial monitoring, and live-node administration. Tools are grouped as: + +- **Discovery**: `list_devices`, `list_boards`, `get_board` +- **Build & flash**: `build`, `clean`, `pio_flash`, `erase_and_flash` (ESP32 factory), `update_flash` (ESP32 OTA), `touch_1200bps` +- **Serial sessions**: `serial_open`, `serial_read`, `serial_list`, `serial_close` +- **Device reads**: `device_info`, `list_nodes` +- **Device writes** (require `confirm=True`): `set_owner`, `get_config`, `set_config`, `get_channel_url`, `set_channel_url`, `send_text`, `reboot`, `shutdown`, `factory_reset`, `set_debug_log_api` +- **userPrefs admin**: `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest`, `userprefs_testing_profile` +- **Vendor escape hatches**: `esptool_*`, `nrfutil_*`, `picotool_*` + +Setup: `cd mcp-server && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'`. 
The repo registers the server via `.mcp.json` — Claude Code picks it up automatically. + +See `mcp-server/README.md` for argument shapes and the **MCP Server & Hardware Test Harness** section of `.github/copilot-instructions.md` for agent usage rules (tool surface, fixture contract, firmware integration points, recovery playbooks). + +## Slash commands (AI-assisted workflows) + +Three test-and-diagnose workflows exist as slash commands: + +- **`/test` (Claude Code) / `/mcp-test` (Copilot)** — run the hardware test suite and interpret failures +- **`/diagnose` / `/mcp-diagnose`** — read-only device health report +- **`/repro` / `/mcp-repro`** — flakiness triage: re-run one test N times, diff firmware logs between passes and failures + +Bodies live in `.claude/commands/` and `.github/prompts/` respectively. `.claude/commands/README.md` is the index. + +## Encryption at a glance + +Two layers, both in `src/mesh/CryptoEngine.cpp`: + +- **Channel (symmetric)** — **AES-CTR** with a channel-wide PSK (AES-128 or AES-256). Nonce = packet_id ‖ from_node ‖ block_counter. No AEAD; integrity is soft (channel-hash filter). The well-known default PSK lives in `src/mesh/Channels.h`; a 1-byte PSK is a short-form index into it. +- **Per-peer PKI** — **X25519 ECDH** (Curve25519, 32-byte keys) → SHA-256 → **AES-256-CCM** with an 8-byte MAC. Fresh 32-bit `extraNonce` per packet, sent in the clear alongside the MAC. 12-byte wire overhead (`MESHTASTIC_PKC_OVERHEAD`). Used for DMs. Also used for remote admin (`src/modules/AdminModule.cpp`), where AdminMessage authorization is gated by `config.security.admin_key[0..2]`. Disabled entirely in Ham mode (`user.is_licensed=true`). + +Key rotation to never trigger casually: only the **full** factory reset (`factory_reset_device`, `eraseBleBonds=true`) wipes `security.private_key` and regenerates the keypair — every peer holds the old public key, so DMs silently fail PKI decrypt until NodeInfo re-exchanges. 
The **partial** config reset (`factory_reset_config`) preserves the private key and doesn't invalidate peer relationships. Explicitly blanking `security.private_key` via admin also triggers regen. See the **Encryption & Key Management** section of `.github/copilot-instructions.md` for the full spec (nonce layout, send/receive selection logic including infrastructure-portnum exceptions, admin-key + session-passkey authorization, `is_managed` scope, key-rotation hazards).
+
+## House rules
+
+- **No destructive device operations without operator approval.** `factory_reset`, `erase_and_flash`, `reboot`, `shutdown`, history-rewriting git ops — describe the action and stop. Operator authorizes.
+- **One MCP call per serial port at a time.** The port lock is exclusive; concurrent calls deadlock. Sequence: open → read/mutate → close, then next device.
+- **`userPrefs.jsonc` is session state during tests.** The `_session_userprefs` fixture snapshots + restores it; never edit it from inside a test.
+- **Don't speculate about firmware root causes.** When evidence doesn't support a classification, say "unknown" and list what would disambiguate.
+- **Run `trunk fmt` before proposing a commit.** The `trunk_check` CI gate will reject unformatted code.
+- **`confirm=True` on destructive MCP tools is a real gate, not a formality.** Don't bypass it via auto-approve settings.
+- **Keep code comments minimal — one or two lines, max.** Comment only when the _why_ isn't obvious from the code; never restate what the next line does. No multi-paragraph block comments explaining straightforward changes. The diff and commit message carry the rationale; the code carries the behavior.
+- **Use `Throttle` for time-based rate limiting, not raw `millis()` math.** `src/mesh/Throttle.h` provides `Throttle::isWithinTimespanMs(lastMs, intervalMs)` (returns true while inside the cooldown) and `Throttle::execute(&lastMs, intervalMs, func)` (function-pointer form that updates the timestamp on fire).
Use these for any "did N ms pass since X" check — raw `millis() > lastMs + N` is rollover-unsafe (breaks after ~49.7 days) and inconsistent with the rest of the codebase. The helpers compute `now - lastMs` with unsigned subtraction, which wraps correctly.
+
+## Typical agent workflows
+
+### Flashing a device
+
+1. `list_devices` → find the port + likely VID
+2. `list_boards` → confirm the env, or use the known default for the hardware
+3. `pio_flash(env=..., port=..., confirm=True)` for any arch, or `erase_and_flash(env=..., port=..., confirm=True)` for an ESP32 factory install
+
+### Inspecting live node state
+
+1. `device_info(port=...)` — short summary (node num, firmware version, region, peer count)
+2. `list_nodes(port=...)` — full peer table (SNR, RSSI, pubkey presence, last_heard)
+3. `get_config(section="lora", port=...)` — LoRa settings for cross-device comparison
+
+Sequence these; don't parallelize on the same port.
+
+### Testing a firmware change
+
+1. Build locally: `pio run -e `
+2. Flash the test device: `pio_flash(env=..., port=..., confirm=True)`
+3. Run the suite: `./mcp-server/run-tests.sh tests/` or `/test tests/`
+4. On failure, open `mcp-server/tests/report.html` → `Meshtastic debug` section for the firmware log tail + device state dump
+5. Iterate
+
+### Debugging a flaky test
+
+1. `/repro [count]` — re-runs the test N times, diffs firmware logs between passes and failures
+2. If the first attempt always fails and the rest pass, that's a state-leak pattern → suggest `--force-bake` or a clean device state, don't chase the first failure
+3. If all N fail, this isn't a flake — it's a regression. Stop iterating and escalate to `/test` for full-suite context.
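
The triage rules in the flaky-test steps above can be sketched as a small classifier. This is only an illustration under stated assumptions, not part of the `/repro` command: the function name `classify_repro` and the labels `no-repro` and `flaky` are hypothetical, and the input is assumed to be one pass/fail boolean per attempt, in run order.

```python
from typing import Sequence


def classify_repro(outcomes: Sequence[bool]) -> str:
    """Classify per-attempt results from a repeated test run (True = pass).

    Hypothetical helper: the label strings are illustrative and are not
    emitted by the real /repro workflow.
    """
    if not outcomes:
        # No evidence at all: follow the house rule and say "unknown".
        return "unknown"
    if all(outcomes):
        # Every attempt passed, so the failure did not reproduce.
        return "no-repro"
    if not any(outcomes):
        # Every attempt failed: a regression, not a flake.
        return "regression"
    if not outcomes[0] and all(outcomes[1:]):
        # Only the first attempt failed: the state-leak pattern.
        return "state-leak"
    # Mixed passes and failures with no recognisable pattern.
    return "flaky"
```

With five attempts where only the first fails, `classify_repro([False, True, True, True, True])` returns `"state-leak"`, which maps to the "suggest `--force-bake` or a clean device state" advice in step 2. Five failures return `"regression"`, which maps to step 3.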
+
+## Where to look
+
+| Path | What's there |
+| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
+| `src/` | Firmware C++ source (`mesh/`, `modules/`, `platform/`, `graphics/`, `gps/`, `motion/`, `mqtt/`, …) |
+| `src/mesh/` | Core: NodeDB, Router, Channels, CryptoEngine, radio interfaces, StreamAPI, PhoneAPI |
+| `src/modules/` | Feature modules; `Telemetry/Sensor/` has 50+ I2C sensor drivers |
+| `variants/` | 200+ hardware variant definitions (`variant.h` + `platformio.ini` per board) |
+| `protobufs/` | `.proto` definitions; regenerate with `bin/regen-protos.sh` |
+| `test/` | Firmware unit tests (12 suites; `pio test -e native`) |
+| `mcp-server/` | Python MCP server + pytest hardware integration tests |
+| `mcp-server/tests/` | Tiered pytest suite: `unit/`, `mesh/`, `telemetry/`, `monitor/`, `recovery/`, `ui/`, `fleet/`, `admin/`, `provisioning/` |
+| `.claude/commands/` | Claude Code slash command bodies |
+| `.github/prompts/` | Copilot prompt bodies (mirrors of the Claude Code ones) |
+| `.github/copilot-instructions.md` | **Primary agent instructions — read this** |
+| `.github/workflows/` | CI pipelines |
+| `.mcp.json` | MCP server registration for Claude Code |
+
+## Recovery one-liners
+
+- **`userPrefs.jsonc` dirty after a test run?** Re-run `./mcp-server/run-tests.sh` once (pre-flight self-heals from the sidecar). If still dirty: `git checkout userPrefs.jsonc`.
+- **nRF52 not responding?** `mcp__meshtastic__touch_1200bps(port=...)` drops it into the DFU bootloader, then `pio_flash` re-installs.
+- **Device fully wedged (no DFU)?** `mcp__meshtastic__uhubctl_cycle(role="nrf52", confirm=True)` hard-power-cycles it via USB hub PPPS. Needs `uhubctl` installed (`brew install uhubctl` / `apt install uhubctl`); on Linux without udev rules, permission errors fail fast, so use `sudo uhubctl` yourself or configure udev access.
+- **Port busy?** `lsof ` to find the holder. Usually a stale `pio device monitor` or zombie `meshtastic_mcp` process. Kill it.
+- **Multiple MCP servers running?** `ps aux | grep meshtastic_mcp` — zombies hold ports. Kill all but the one your host spawned.
+- **macOS: `LIBUSB_ERROR_BUSY` on a CH341 LoRa adapter?** A third-party WCH `CH34xVCPDriver` is claiming interface 0. Find the bundle ID with `ioreg -p IOUSB -l -w 0 | grep -B2 -A30 0x5512`, then `sudo kmutil unload -b `. Apple's bundled CH34x kext targets the CH340 UART (PID 0x7523), not the SPI bridge — it's never the culprit.
+
+## Environment variables (test harness)
+
+| Var | Purpose |
+| ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `MESHTASTIC_MCP_ENV_` | Override PlatformIO env for a role (e.g. `MESHTASTIC_MCP_ENV_NRF52=rak4631-dap`). Default map: `nrf52→rak4631`, `esp32s3→heltec-v3`. |
+| `MESHTASTIC_MCP_SEED` | PSK seed for the session test profile. Defaults to `mcp--`. |
+| `MESHTASTIC_MCP_FLASH_LOG` | File path to tee pio/esptool/nrfutil/picotool output. `run-tests.sh` sets this to `tests/flash.log` so the TUI can stream live flash progress. |
+| `MESHTASTIC_MCP_TCP_HOST` | `host` or `host:port` of a `meshtasticd` daemon (e.g. the `native-macos` build). Surfaces it in `list_devices` as `tcp://host:port` so `connect()`-based tools target it transparently. Default port 4403. |
+| `MESHTASTIC_UHUBCTL_BIN` | Absolute path to `uhubctl` binary. Default: PATH lookup. |
+| `MESHTASTIC_UHUBCTL_LOCATION_` | Pin a role to a specific uhubctl hub location (e.g. `1-1.3`). Wins over VID auto-detection — use when multiple devices share a VID. |
+| `MESHTASTIC_UHUBCTL_PORT_` | Pin a role to a specific hub port number. Required alongside `LOCATION_`.
| +| `MESHTASTIC_UI_CAMERA_BACKEND` | Camera backend for UI tier + `capture_screen` tool: `opencv` / `ffmpeg` / `null` / `auto` (default). | +| `MESHTASTIC_UI_CAMERA_DEVICE` | Generic camera device (index or path). Used by the UI tier when no per-role var is set. | +| `MESHTASTIC_UI_CAMERA_DEVICE_` | Per-role camera pinning (e.g. `MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3=0` for the OLED-bearing heltec-v3). | +| `MESHTASTIC_UI_OCR_BACKEND` | OCR engine selection: `easyocr` / `pytesseract` / `null` / `auto` (default). | +| `MESHTASTIC_UI_TUI_CAMERA` | Set to `1` to mount the live camera-feed panel in `meshtastic-mcp-test-tui`. | diff --git a/Dockerfile b/Dockerfile index e00d81658d8..ba013cb1557 100644 --- a/Dockerfile +++ b/Dockerfile @@ -3,15 +3,17 @@ # trunk-ignore-all(hadolint/DL3008): Do not pin apt package versions # trunk-ignore-all(hadolint/DL3013): Do not pin pip package versions -FROM python:3.14-slim-trixie AS builder +FROM debian:trixie AS builder ARG PIO_ENV=native ENV DEBIAN_FRONTEND=noninteractive ENV TZ=Etc/UTC # Install Dependencies ENV PIP_ROOT_USER_ACTION=ignore +ENV PIP_BREAK_SYSTEM_PACKAGES=1 RUN apt-get update && apt-get install --no-install-recommends -y \ curl wget g++ zip git ca-certificates pkg-config \ + python3-pip python3-grpc-tools \ libgpiod-dev libyaml-cpp-dev libbluetooth-dev libi2c-dev libuv1-dev \ libusb-1.0-0-dev libulfius-dev liborcania-dev libssl-dev \ libx11-dev libinput-dev libxkbcommon-x11-dev libsqlite3-dev libsdl2-dev \ diff --git a/alpine.Dockerfile b/alpine.Dockerfile index 75c9aa594d0..6d1b999e299 100644 --- a/alpine.Dockerfile +++ b/alpine.Dockerfile @@ -3,12 +3,19 @@ # trunk-ignore-all(hadolint/DL3018): Do not pin apk package versions # trunk-ignore-all(hadolint/DL3013): Do not pin pip package versions -FROM python:3.14-alpine3.22 AS builder +# Ensure the Alpine version is updated in both stages of the container! 
+FROM alpine:3.23 AS builder ARG PIO_ENV=native -ENV PIP_ROOT_USER_ACTION=ignore +# Enable Alpine community repository (for 'py3-grpcio-tools') +RUN echo "https://dl-cdn.alpinelinux.org/alpine/v$(cut -d. -f1,2 /etc/alpine-release)/community" >> /etc/apk/repositories + +# Install Dependencies +ENV PIP_ROOT_USER_ACTION=ignore +ENV PIP_BREAK_SYSTEM_PACKAGES=1 RUN apk --no-cache add \ bash g++ libstdc++-dev linux-headers zip git ca-certificates libbsd-dev \ + py3-pip py3-grpcio-tools \ libgpiod-dev yaml-cpp-dev bluez-dev \ libusb-dev i2c-tools-dev libuv-dev openssl-dev pkgconf argp-standalone \ libx11-dev libinput-dev libxkbcommon-dev sqlite-dev sdl2-dev \ @@ -60,4 +67,4 @@ EXPOSE 4403 CMD [ "sh", "-cx", "meshtasticd --fsdir=/var/lib/meshtasticd" ] -HEALTHCHECK NONE \ No newline at end of file +HEALTHCHECK NONE diff --git a/bin/_rewrite_proto_namespace.py b/bin/_rewrite_proto_namespace.py new file mode 100755 index 00000000000..53c0115957f --- /dev/null +++ b/bin/_rewrite_proto_namespace.py @@ -0,0 +1,64 @@ +#!/usr/bin/env python3 +"""Post-process protoc-generated Python files to live under a local namespace. + +Called by bin/regen-py-protos.sh. Walks the generated *_pb2.py files in the +target directory and rewrites every `meshtastic` reference (imports, dotted +attribute access) to use the new namespace (e.g., `meshtastic_v25`). + +Why: the .proto files declare `package meshtastic;`, so protoc emits +`from meshtastic import mesh_pb2 as ...` lines. That would shadow the PyPI +`meshtastic` package which other parts of the mcp-server depend on. Renaming +to a local namespace keeps both available. + +Usage: + _rewrite_proto_namespace.py +""" + +from __future__ import annotations + +import pathlib +import re +import sys + + +def rewrite(dir_path: pathlib.Path, new_ns: str) -> int: + # Standard protoc import forms: + # from meshtastic.X_pb2 import ... (rare, for direct symbol pulls) + # from meshtastic import X_pb2 as ... 
(common, the cross-file ref) + # import meshtastic.X_pb2 (also possible) + pattern_dotted_from = re.compile(r"^from meshtastic\.", re.MULTILINE) + pattern_bare_from = re.compile(r"^from meshtastic import ", re.MULTILINE) + pattern_dotted_import = re.compile(r"^import meshtastic\.", re.MULTILINE) + + count = 0 + for p in dir_path.glob("*.py"): + text = p.read_text(encoding="utf-8") + new = pattern_dotted_from.sub(f"from {new_ns}.", text) + new = pattern_bare_from.sub(f"from {new_ns} import ", new) + new = pattern_dotted_import.sub(f"import {new_ns}.", new) + # NOTE: we deliberately leave `meshtastic/X.proto` source-filename + # references inside descriptor strings alone. The descriptor pool is + # keyed by source filename (independent of Python package layout), so + # those don't collide with the PyPI package's descriptors. + if new != text: + p.write_text(new, encoding="utf-8") + count += 1 + return count + + +def main(argv: list[str]) -> int: + if len(argv) != 2: + print("usage: _rewrite_proto_namespace.py ", file=sys.stderr) + return 2 + dir_path = pathlib.Path(argv[0]) + new_ns = argv[1] + if not dir_path.is_dir(): + print(f"directory not found: {dir_path}", file=sys.stderr) + return 2 + n = rewrite(dir_path, new_ns) + print(f"rewrote {n} file(s) in {dir_path} → namespace {new_ns}", file=sys.stderr) + return 0 + + +if __name__ == "__main__": + sys.exit(main(sys.argv[1:])) diff --git a/bin/build-esp32.sh b/bin/build-esp32.sh index d07a09a1664..4e799b30a3a 100755 --- a/bin/build-esp32.sh +++ b/bin/build-esp32.sh @@ -38,4 +38,4 @@ cp bin/device-install.* $OUTDIR/ cp bin/device-update.* $OUTDIR/ echo "Copying manifest" -cp $BUILDDIR/$basename.mt.json $OUTDIR/$basename.mt.json || true +cp $BUILDDIR/$basename.mt.json $OUTDIR/$basename.mt.json diff --git a/bin/build-native.sh b/bin/build-native.sh index f35e46a8790..e34b7558093 100755 --- a/bin/build-native.sh +++ b/bin/build-native.sh @@ -31,5 +31,6 @@ basename=meshtasticd-$1-$VERSION pio pkg install --environment 
"$PIO_ENV" || platformioFailed pio run --environment "$PIO_ENV" || platformioFailed -cp "$BUILDDIR/meshtasticd" "$OUTDIR/meshtasticd_linux_$(uname -m)" +os_name=$(uname -s | tr '[:upper:]' '[:lower:]') +cp "$BUILDDIR/meshtasticd" "$OUTDIR/meshtasticd_${os_name}_$(uname -m)" cp bin/native-install.* $OUTDIR/ diff --git a/bin/config.d/lora-station-g3.yaml b/bin/config.d/lora-station-g3.yaml new file mode 100644 index 00000000000..79d0d7e092d --- /dev/null +++ b/bin/config.d/lora-station-g3.yaml @@ -0,0 +1,18 @@ +# Station G3 motherboard with a Raspberry Pi Zero 2W as the MCU daughterboard. +# Verify spidev / I2C device paths for your OS — they may differ. +Meta: + name: Station G3 + support: community + compatible: + - raspberry-pi + +Lora: + Module: sx1262 + IRQ: 22 # BCM pin — wiki spec + Reset: 16 # BCM pin — wiki spec + Busy: 24 # BCM pin — wiki spec + # CS: 8 # BCM 8 = SPI0 CE0 (default); uncomment only to override + DIO2_AS_RF_SWITCH: true + DIO3_TCXO_VOLTAGE: true + spidev: spidev0.0 + # SX126X_MAX_POWER: 19 # matches Station G2 firmware cap; raise carefully per PA jumper mode diff --git a/bin/gen-fake-nodedb-seed.py b/bin/gen-fake-nodedb-seed.py new file mode 100755 index 00000000000..d8cf3f4b96e --- /dev/null +++ b/bin/gen-fake-nodedb-seed.py @@ -0,0 +1,439 @@ +#!/usr/bin/env python3 +"""Deterministic seed-data generator for the fake NodeDB fixture pipeline. + +Writes a JSONL file describing N fake-but-realistic Meshtastic peers. +The output is hand-editable and committed; a sibling compile step +(bin/seed-json-to-proto.py) turns it into a binary `meshtastic_NodeDatabase` +v25 protobuf with fresh "now-relative" timestamps. + +Determinism contract: + Same --seed -> byte-identical JSONL output, regardless of wall clock. + All timestamps are stored as `*_offset_sec` (seconds before "now"); the + compile step resolves them to absolute epochs at compile time. 
+ +Structural fields covered: + * NodeInfoLite header: num, long_name, short_name, hw_model, role, + public_key, snr, channel, hops_away, next_hop, bitfield flags + * PositionLite: lat/long Gaussian around --centroid, altitude, source + * DeviceMetrics: battery/voltage/util/uptime + * EnvironmentMetrics: temp/humidity/pressure/iaq + * StatusMessage: error_code (usually zero) + +Active-board allow-list: + hw_model values are restricted to the intersection of + (a) variants with `custom_meshtastic_support_level = 1` in + variants/*/*/platformio.ini, AND + (b) values present in the `HardwareModel` enum in mesh.proto. + See HW_MODEL_WEIGHTS below. Deprecated boards (legacy TLORA / Heltec V1-2 / + classic TBEAM / TBEAM_V0P7 / Nano G1 / etc.) and fuzzer-only sentinels + (PORTDUINO, ANDROID_SIM, DIY_V1, ...) are excluded. + +Active-role allow-list: + Excludes ROUTER_CLIENT (deprecated v2.3.15) and REPEATER (deprecated v2.7.11). +""" + +from __future__ import annotations + +import argparse +import datetime as _dt +import json +import math +import pathlib +import random +import sys + +# -------------------------------------------------------------------------- +# Active-board allow-list (intersection of tier-1 variants + HardwareModel enum). 
+# Refresh by running: +# for f in $(find variants -name 'platformio.ini' | xargs grep -lE 'custom_meshtastic_support_level = 1'); do +# grep custom_meshtastic_hw_model_slug $f | awk -F= '{print $2}' | tr -d ' '; +# done | sort -u | comm -12 - <(python3 -c "from meshtastic.protobuf.mesh_pb2 import HardwareModel; print('\\n'.join(HardwareModel.keys()))" | sort) +# -------------------------------------------------------------------------- +HW_MODEL_WEIGHTS: dict[str, float] = { + "HELTEC_V3": 14.0, + "T_DECK": 9.0, + "HELTEC_V4": 8.0, + "RAK4631": 8.0, + "HELTEC_MESH_POCKET": 6.0, + "TRACKER_T1000_E": 5.0, + "HELTEC_MESH_NODE_T114": 5.0, + "T_DECK_PRO": 5.0, + "LILYGO_TBEAM_S3_CORE": 4.0, + "HELTEC_WIRELESS_PAPER": 4.0, + "HELTEC_WSL_V3": 3.0, + "T_ECHO": 3.0, + "HELTEC_WIRELESS_TRACKER": 3.0, + "HELTEC_WIRELESS_TRACKER_V2": 2.0, + "HELTEC_VISION_MASTER_E290": 2.0, + "HELTEC_MESH_SOLAR": 2.0, + "SEEED_WIO_TRACKER_L1": 2.0, + "T_LORA_PAGER": 1.5, + "HELTEC_VISION_MASTER_E213": 1.5, + "T_ECHO_PLUS": 1.0, + "MUZI_BASE": 1.0, + "WISMESH_TAP_V2": 1.0, + "THINKNODE_M2": 1.0, + "THINKNODE_M5": 1.0, + "TLORA_T3_S3": 1.0, + # Long tail (uniform low weight across remaining tier-1 boards): + "HELTEC_V4_R8": 0.3, + "HELTEC_VISION_MASTER_T190": 0.3, + "HELTEC_HT62": 0.3, + "HELTEC_MESH_NODE_T096": 0.3, + "M5STACK_C6L": 0.3, + "MINI_EPAPER_S3": 0.3, + "MUZI_R1_NEO": 0.3, + "NOMADSTAR_METEOR_PRO": 0.3, + "RAK3312": 0.3, + "RAK3401": 0.3, + "SEEED_SOLAR_NODE": 0.3, + "SEEED_WIO_TRACKER_L1_EINK": 0.3, + "SENSECAP_INDICATOR": 0.3, + "TBEAM_1_WATT": 0.3, + "THINKNODE_M1": 0.3, + "THINKNODE_M3": 0.3, + "THINKNODE_M6": 0.3, + "T_ECHO_LITE": 0.3, + "WISMESH_TAG": 0.3, + "WISMESH_TAP": 0.3, + "XIAO_NRF52_KIT": 0.3, + "CROWPANEL": 0.3, +} + +# Non-deprecated roles only. 
+ROLE_WEIGHTS: dict[str, float] = { + "CLIENT": 75.0, + "CLIENT_MUTE": 5.0, + "ROUTER": 7.0, + "TRACKER": 3.0, + "SENSOR": 2.0, + "CLIENT_HIDDEN": 2.0, + "ROUTER_LATE": 2.0, + "CLIENT_BASE": 2.0, + "TAK": 1.0, + "TAK_TRACKER": 0.5, + "LOST_AND_FOUND": 0.5, +} + +# Name pools — 60 firsts × 60 lasts = 3600 combinations. +FIRSTS = [ + "Quick", "Brave", "Silent", "Wild", "Lone", "Bright", "Red", "Blue", + "Green", "Black", "White", "Iron", "Steel", "Copper", "Silver", "Gold", + "Stone", "River", "Forest", "Mountain", "Canyon", "Desert", "Storm", "Sky", + "Solar", "Lunar", "Dawn", "Dusk", "Misty", "Frosty", "Sunny", "Shady", + "Happy", "Sleepy", "Drowsy", "Sneaky", "Sharp", "Smooth", "Rough", "Loud", + "Soft", "Slow", "Fast", "Tall", "Short", "Old", "New", "Tiny", + "Giant", "Hidden", "Lost", "Found", "Wandering", "Roving", "Drifting", "Floating", + "Burning", "Frozen", "Whispering", "Howling", +] +LASTS = [ + "Phoenix", "Lion", "Bear", "Wolf", "Hawk", "Eagle", "Fox", "Lynx", + "Cougar", "Coyote", "Raven", "Owl", "Crow", "Falcon", "Heron", "Crane", + "Otter", "Badger", "Bison", "Elk", "Moose", "Stag", "Doe", "Hare", + "Marmot", "Mole", "Beaver", "Squirrel", "Mustang", "Bronco", "Pony", "Colt", + "Cobra", "Viper", "Mamba", "Adder", "Gecko", "Iguana", "Tortoise", "Turtle", + "Salmon", "Trout", "Bass", "Pike", "Shark", "Whale", "Dolphin", "Seal", + "Cactus", "Yucca", "Sage", "Juniper", "Pine", "Cedar", "Aspen", "Oak", + "Bluff", "Mesa", "Arroyo", "Ridge", +] + +# Brief callsign pool for licensed-looking suffixes. +CALLSIGN_PREFIXES = ["KX", "WD", "N5", "KE", "AB", "W5", "K1", "KQ", "AE", "NM"] + +# Only emojis that fit in 4 UTF-8 bytes (no variation selectors). short_name's +# nanopb max_size:5 (incl. NUL) limits content to 4 bytes. ❄️ / ☀️ would be +# 6 bytes due to U+FE0F variation selector — explicitly excluded. 
+EMOJI_SHORTNAMES = ["🦊", "🐺", "🦅", "🐢", "🌵", "🔥", "🌙", + "🌊", "🗻", "🌲", "🦌", "🐝", "🦂", "🦉", + "🦇", "🦋"] + +# -------------------------------------------------------------------------- +# Helpers +# -------------------------------------------------------------------------- + +NUM_RESERVED = 4 # firmware reserves 0..3 (per NodeDB constants) +NUM_MAX_EXCLUSIVE = 0x80000000 # restrict to positive int32 range for readability + + +def _weighted_choice(rng: random.Random, weights: dict[str, float]) -> str: + """Deterministic weighted pick. Uses sorted keys so dict order is fixed.""" + keys = sorted(weights.keys()) + totals = [weights[k] for k in keys] + return rng.choices(keys, weights=totals, k=1)[0] + + +def _gen_long_name(rng: random.Random, is_licensed: bool) -> str: + base = f"{rng.choice(FIRSTS)} {rng.choice(LASTS)}" + if is_licensed: + prefix = rng.choice(CALLSIGN_PREFIXES) + # Two trailing alpha chars after the digit; keep within 25 - len(base) - 1 + suffix = f" {prefix}{rng.randint(0,9)}{rng.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ')}{rng.choice('ABCDEFGHIJKLMNOPQRSTUVWXYZ')}" + # nanopb max_size:25 means C string fits 24 bytes + NUL. + if len(base) + len(suffix) <= 24: + base = base + suffix + # Hard cap to 24 chars (nanopb max_size:25 minus NUL). 
+ return base[:24] + + +def _gen_short_name(rng: random.Random, long_name: str) -> str: + # 10% emoji-only short_name + if rng.random() < 0.10: + return rng.choice(EMOJI_SHORTNAMES) + first_char = long_name[0].upper() if long_name else "X" + alphanums = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" + return first_char + "".join(rng.choices(alphanums, k=3)) + + +def _gen_hops_away(rng: random.Random) -> int: + # Geometric-ish: 0→55%, 1→25%, 2→12%, 3→5%, 4→2%, 5+→1% + r = rng.random() + if r < 0.55: + return 0 + if r < 0.80: + return 1 + if r < 0.92: + return 2 + if r < 0.97: + return 3 + if r < 0.99: + return 4 + return rng.randint(5, 7) + + +def _gen_position( + rng: random.Random, + centroid_lat: float, + centroid_lon: float, + spread_km: float, + last_heard_offset_sec: int, +) -> dict: + # 1 deg ≈ 111 km at the equator; we use this as a flat approximation. + lat = centroid_lat + rng.gauss(0.0, spread_km / 111.0) + lon = centroid_lon + rng.gauss(0.0, spread_km / 111.0) + altitude = max(0, round(rng.gauss(1376.0, 250.0))) # T or C valley floor + relief + # Position was reported up to 300s before last_heard. + time_offset_sec = last_heard_offset_sec + rng.randint(0, 300) + return { + "latitude": round(lat, 6), + "longitude": round(lon, 6), + "altitude": altitude, + "time_offset_sec": time_offset_sec, + "location_source": "LOC_INTERNAL", + } + + +def _gen_telemetry(rng: random.Random) -> dict: + # 5% plugged-in (battery_level == 101); rest uniform [10..100]. + if rng.random() < 0.05: + battery_level = 101 + voltage = 4.20 + else: + battery_level = rng.randint(10, 100) + voltage = round(3.3 + (battery_level / 100.0) * 0.9, 3) + # Beta distributions for low/right-skewed metrics; randomly draw via gammavariate. 
+ def _beta(a: float, b: float) -> float: + x = rng.gammavariate(a, 1.0) + y = rng.gammavariate(b, 1.0) + return x / (x + y) + channel_utilization = round(_beta(2.0, 15.0) * 100.0, 2) + air_util_tx = round(_beta(1.5, 20.0) * 10.0, 3) + uptime_seconds = int(rng.expovariate(1.0 / 86400.0)) + return { + "battery_level": battery_level, + "voltage": voltage, + "channel_utilization": channel_utilization, + "air_util_tx": air_util_tx, + "uptime_seconds": uptime_seconds, + } + + +def _gen_environment(rng: random.Random) -> dict: + return { + "temperature": round(rng.gauss(22.0, 8.0), 2), + "relative_humidity": round(min(100.0, max(0.0, rng.gauss(55.0, 20.0))), 2), + "barometric_pressure": round(rng.gauss(1013.0, 8.0), 2), + "iaq": int(min(500, max(0, round(rng.gauss(50.0, 30.0))))), + } + + +def _gen_status(rng: random.Random) -> dict: + # `StatusMessage` (mesh.proto:1445) has a single free-form `string status`. + # Most peers report a healthy short status; occasional alert string. + healthy = ["OK", "online", "active", "running", "ready", "nominal"] + alert = ["low-batt", "no-gps", "weak-signal", "rebooted", "offline-soon"] + if rng.random() < 0.92: + return {"status": rng.choice(healthy)} + return {"status": rng.choice(alert)} + + +def _gen_node( + rng: random.Random, + num: int, + centroid_lat: float, + centroid_lon: float, + spread_km: float, + coverage: dict[str, float], + last_heard_mean_sec: int, + last_heard_max_sec: int, +) -> dict: + is_licensed = rng.random() < 0.05 + long_name = _gen_long_name(rng, is_licensed) + short_name = _gen_short_name(rng, long_name) + hw_model = _weighted_choice(rng, HW_MODEL_WEIGHTS) + role = _weighted_choice(rng, ROLE_WEIGHTS) + has_public_key = rng.random() < 0.92 + public_key_hex = ( + "".join(f"{rng.randint(0,255):02x}" for _ in range(32)) if has_public_key else "" + ) + snr = round(max(-20.0, min(12.0, rng.gauss(6.0, 4.0))), 2) + channel = 0 if rng.random() < 0.90 else rng.randint(1, 7) + hops_away = _gen_hops_away(rng) + next_hop 
= rng.randint(0, 255) if hops_away > 0 else 0 + last_heard_offset_sec = int(min(rng.expovariate(1.0 / last_heard_mean_sec), last_heard_max_sec)) + + bitfield = { + "has_user": True, + "is_favorite": rng.random() < 0.08, + "is_muted": rng.random() < 0.03, + "via_mqtt": rng.random() < 0.12, + "is_ignored": rng.random() < 0.01, + "is_licensed": is_licensed, + "has_is_unmessagable": True, + "is_unmessagable": rng.random() < 0.02, + "is_key_manually_verified": rng.random() < 0.04, + } + + node: dict = { + "num": f"0x{num:08x}", + "long_name": long_name, + "short_name": short_name, + "hw_model": hw_model, + "role": role, + "public_key_hex": public_key_hex, + "snr": snr, + "channel": channel, + "hops_away": hops_away, + "next_hop": next_hop, + "last_heard_offset_sec": last_heard_offset_sec, + "bitfield": bitfield, + "position": ( + _gen_position(rng, centroid_lat, centroid_lon, spread_km, last_heard_offset_sec) + if rng.random() < coverage["position"] + else None + ), + "telemetry": _gen_telemetry(rng) if rng.random() < coverage["telemetry"] else None, + "environment": _gen_environment(rng) if rng.random() < coverage["environment"] else None, + "status": _gen_status(rng) if rng.random() < coverage["status"] else None, + } + return node + + +def _parse_my_node_num(s: str | None) -> int | None: + if s is None: + return None + s = s.strip() + if s.startswith("0x") or s.startswith("0X"): + return int(s, 16) + return int(s) + + +def main(argv: list[str]) -> int: + p = argparse.ArgumentParser( + description="Deterministic JSONL seed for the fake NodeDB fixture.", + formatter_class=argparse.ArgumentDefaultsHelpFormatter, + ) + p.add_argument("--count", type=int, required=True, help="Number of fake nodes to emit.") + p.add_argument("--seed", type=int, required=True, help="Deterministic seed.") + p.add_argument("--out", required=True, help="Output JSONL path.") + p.add_argument( + "--centroid", + default="33.1284,-107.2528", + help="LAT,LON centroid (default: Truth or 
Consequences, NM).", + ) + p.add_argument("--spread-km", type=float, default=60.0, help="Gaussian std-dev in km.") + p.add_argument("--position-coverage", type=float, default=0.85) + p.add_argument("--telemetry-coverage", type=float, default=0.70) + p.add_argument("--environment-coverage", type=float, default=0.25) + p.add_argument("--status-coverage", type=float, default=0.40) + p.add_argument("--my-node-num", default=None, help="Exclude this NodeNum from generated set (hex or dec).") + p.add_argument("--last-heard-mean-sec", type=int, default=3600) + p.add_argument("--last-heard-max-sec", type=int, default=7 * 86400) + args = p.parse_args(argv) + + if args.count <= 0: + print("--count must be positive", file=sys.stderr) + return 2 + + try: + centroid_lat, centroid_lon = (float(s) for s in args.centroid.split(",")) + except ValueError: + print(f"--centroid must be LAT,LON; got {args.centroid!r}", file=sys.stderr) + return 2 + + my_node_num = _parse_my_node_num(args.my_node_num) + + rng = random.Random(args.seed) + + # 1) Generate a unique deterministic set of NodeNums. + nums: set[int] = set() + while len(nums) < args.count: + n = rng.randrange(NUM_RESERVED, NUM_MAX_EXCLUSIVE) + if my_node_num is not None and n == my_node_num: + continue + nums.add(n) + ordered_nums = sorted(nums) # sort to fix output order independent of set hash + + # 2) Per-node generation (in num order, single RNG continues). + coverage = { + "position": args.position_coverage, + "telemetry": args.telemetry_coverage, + "environment": args.environment_coverage, + "status": args.status_coverage, + } + nodes = [ + _gen_node( + rng, + n, + centroid_lat, + centroid_lon, + args.spread_km, + coverage, + args.last_heard_mean_sec, + args.last_heard_max_sec, + ) + for n in ordered_nums + ] + + # 3) Write JSONL. 
+ out_path = pathlib.Path(args.out) + out_path.parent.mkdir(parents=True, exist_ok=True) + # `generated_at_iso` is informational; it does NOT affect determinism because + # we derive it from the seed, not from wall clock. (Same seed -> same string.) + generated_at = _dt.datetime.fromtimestamp(args.seed, tz=_dt.timezone.utc).isoformat().replace("+00:00", "Z") + meta = { + "_meta": { + "version": 25, + "seed": args.seed, + "count": args.count, + "centroid": [centroid_lat, centroid_lon], + "spread_km": args.spread_km, + "generated_at_iso": generated_at, + "my_node_num_excluded": (None if my_node_num is None else f"0x{my_node_num:08x}"), + "coverage": coverage, + "last_heard_mean_sec": args.last_heard_mean_sec, + "last_heard_max_sec": args.last_heard_max_sec, + } + } + with out_path.open("w", encoding="utf-8") as f: + # `ensure_ascii=False` so emoji short_names survive. `sort_keys=True` for + # determinism (insertion order varies by Python version otherwise). + f.write(json.dumps(meta, ensure_ascii=False, sort_keys=True) + "\n") + for node in nodes: + f.write(json.dumps(node, ensure_ascii=False, sort_keys=True) + "\n") + + print(f"wrote {args.count} nodes to {out_path} ({out_path.stat().st_size} bytes)", file=sys.stderr) + return 0 + + +if __name__ == "__main__": + sys.exit(main(sys.argv[1:])) diff --git a/bin/org.meshtastic.meshtasticd.metainfo.xml b/bin/org.meshtastic.meshtasticd.metainfo.xml index a1690186b19..ed5338af647 100644 --- a/bin/org.meshtastic.meshtasticd.metainfo.xml +++ b/bin/org.meshtastic.meshtasticd.metainfo.xml @@ -87,6 +87,9 @@ + + https://github.com/meshtastic/firmware/releases?q=tag%3Av2.7.24 + https://github.com/meshtastic/firmware/releases?q=tag%3Av2.7.23 diff --git a/bin/platformio-custom.py b/bin/platformio-custom.py index b75c666241c..f1946770c90 100644 --- a/bin/platformio-custom.py +++ b/bin/platformio-custom.py @@ -293,9 +293,12 @@ def load_boot_logo(source, target, env): board_arch = infer_architecture(env.BoardConfig()) 
should_skip_manifest = board_arch is None -# For host/native envs, avoid depending on 'buildprog' (some targets don't define it) -mtjson_deps = [] if should_skip_manifest else ["buildprog"] -if not should_skip_manifest and platform.name == "espressif32": +# Most platforms can generate the manifest as part of the default 'buildprog' target. +# Typically this passes success/failure properly. +mtjson_deps = ["buildprog"] +if platform.name == "espressif32": + # On ESP32, we need to explicitly depend upon the binary to prevent fake-success upon failure. + mtjson_deps = ["$BUILD_DIR/${PROGNAME}.bin"] # Build littlefs image as part of mtjson target # Equivalent to `pio run -t buildfs` target_lfs = env.DataToBin( @@ -309,7 +312,8 @@ def skip_manifest(source, target, env): env.AddCustomTarget( name="mtjson", - dependencies=mtjson_deps, + # For host/native envs, avoid depending on 'buildprog' (some targets don't define it) + dependencies=[], actions=[skip_manifest], title="Meshtastic Manifest (skipped)", description="mtjson generation is skipped for native environments", diff --git a/bin/regen-fake-nodedbs.sh b/bin/regen-fake-nodedbs.sh new file mode 100755 index 00000000000..fd92daa0247 --- /dev/null +++ b/bin/regen-fake-nodedbs.sh @@ -0,0 +1,73 @@ +#!/usr/bin/env bash +# Regenerate the fake-NodeDB fixtures: produces 250 / 500 / 1000 / 2000-node +# JSONL seed files + their compiled v25 protobufs. +# +# Layout: +# test/fixtures/nodedb/seed_v25_.jsonl — COMMITTED, hand-editable. +# build/fixtures/nodedb/nodes_v25_.proto — .gitignored, build artifact. +# Drop into /prefs/nodes.proto. +# +# Daily use: ./bin/regen-fake-nodedbs.sh +# - Recompiles protos from committed seeds (fresh wall-clock timestamps). +# Intentional seed bump: REGEN_SEEDS=yes ./bin/regen-fake-nodedbs.sh +# - Overwrites the committed JSONL files with freshly-seeded data. + +set -euo pipefail +cd "$(dirname "$0")/.." + +# 1) Make sure the Python protobuf bindings exist (in-tree generation; .gitignored). +if [[ ! 
-d bin/_generated/meshtastic ]]; then + echo "regenerating Python protobuf bindings (one-time)..." + ./bin/regen-py-protos.sh +fi + +# 2) Pick a Python interpreter that has the meshtastic deps installed. +# Prefer the mcp-server venv (most likely to be set up by the operator). +PY="python3" +for cand in mcp-server/.venv/bin/python3 .venv/bin/python3; do + if [[ -x "$cand" ]]; then + PY="$cand" + break + fi +done + +# 3) Pinned seeds per size — bump only when you intentionally want different +# structural data committed. Parallel arrays so the script works on +# macOS bash 3.2 (no `declare -A`). +SIZES=(250 500 1000 2000) +SEEDS=(20260511 20260512 20260513 20260514) + +REGEN_SEEDS="${REGEN_SEEDS:-no}" + +mkdir -p build/fixtures/nodedb test/fixtures/nodedb + +for i in 0 1 2 3; do + n="${SIZES[$i]}" + seed="${SEEDS[$i]}" + jsonl=$(printf "test/fixtures/nodedb/seed_v25_%04d.jsonl" "$n") + proto=$(printf "build/fixtures/nodedb/nodes_v25_%04d.proto" "$n") + + if [[ "$REGEN_SEEDS" == "yes" || ! -f "$jsonl" ]]; then + $PY bin/gen-fake-nodedb-seed.py \ + --count "$n" \ + --seed "$seed" \ + --out "$jsonl" \ + --centroid 33.1284,-107.2528 \ + --spread-km 60 \ + --position-coverage 0.85 \ + --telemetry-coverage 0.70 \ + --environment-coverage 0.25 \ + --status-coverage 0.40 + echo " seed: $jsonl ($(wc -c < "$jsonl") bytes)" + fi + + $PY bin/seed-json-to-proto.py --in "$jsonl" --out "$proto" + echo " proto: $proto ($(wc -c < "$proto") bytes)" +done + +echo "" +echo "Done. 
To load on Portduino native:" +echo " cp build/fixtures/nodedb/nodes_v25_1000.proto ~/.portduino/default/prefs/nodes.proto" +echo "" +echo "To push to a hardware device:" +echo " Use the mcp-server tool: push_fake_nodedb(size=1000, target=\"hardware\", port=\"/dev/cu.usbmodemXXXX\", confirm=True)" diff --git a/bin/regen-py-protos.sh b/bin/regen-py-protos.sh new file mode 100755 index 00000000000..5edad232513 --- /dev/null +++ b/bin/regen-py-protos.sh @@ -0,0 +1,51 @@ +#!/usr/bin/env bash +# Regenerate Python protobuf bindings from the in-tree `protobufs/` submodule +# into `bin/_generated/`. Called by bin/regen-fake-nodedbs.sh; also useful as +# a standalone refresh after any change to a .proto file. +# +# Output is .gitignored — bindings are a build artifact. +# +# Namespace rewrite: +# The .proto files declare `package meshtastic;`, which makes protoc emit +# imports like `from meshtastic import mesh_pb2`. That conflicts with the +# PyPI `meshtastic` package (which the mcp-server relies on for its +# SerialInterface/BLEInterface transport). We post-process the generated +# files to live under `meshtastic_v25` instead — both the directory layout +# and all internal imports — so they coexist cleanly with the PyPI package. + +set -euo pipefail +cd "$(dirname "$0")/.." + +if ! command -v protoc >/dev/null 2>&1; then + echo "ERROR: protoc not found in PATH." >&2 + echo " macOS: brew install protobuf" >&2 + echo " Ubuntu/Debian: apt install protobuf-compiler" >&2 + exit 1 +fi + +OUT=bin/_generated +LOCAL_NS=meshtastic_v25 + +rm -rf "$OUT" +mkdir -p "$OUT" + +# 1) Generate from the in-tree protos. nanopb.proto first so its descriptor +# is available for the [(nanopb).*] options on other messages. +protoc \ + --proto_path=protobufs \ + --python_out="$OUT" \ + protobufs/nanopb.proto \ + protobufs/meshtastic/*.proto + +# 2) Move the generated `meshtastic/` directory to `meshtastic_v25/`. 
+mv "$OUT/meshtastic" "$OUT/$LOCAL_NS" + +# 3) Rewrite internal imports: any reference to `meshtastic.X_pb2` or +# `from meshtastic import X_pb2` becomes `meshtastic_v25.*`. +python3 bin/_rewrite_proto_namespace.py "$OUT/$LOCAL_NS" "$LOCAL_NS" + +# 4) Make the package importable. +touch "$OUT/__init__.py" +touch "$OUT/$LOCAL_NS/__init__.py" + +echo "regenerated Python protobuf bindings -> $OUT/$LOCAL_NS/ (namespace: $LOCAL_NS)" >&2 diff --git a/bin/seed-json-to-proto.py b/bin/seed-json-to-proto.py new file mode 100755 index 00000000000..80eb34d63b5 --- /dev/null +++ b/bin/seed-json-to-proto.py @@ -0,0 +1,342 @@ +#!/usr/bin/env python3 +"""Compile a committed seed JSONL into a binary meshtastic_NodeDatabase v25 proto. + +The input is produced by `bin/gen-fake-nodedb-seed.py`. Timestamps in the JSONL +are stored as `*_offset_sec` (seconds before "now"); this script resolves them +to absolute epochs using `--now-epoch` (default: current wall clock). + +Output is a raw `pb_encode`-compatible binary that can be dropped at +`/prefs/nodes.proto` on the device (Portduino prefs dir or hardware via +XModem) and loaded by `NodeDB::loadFromDisk` at boot. + +Wire format reference: + protobufs/meshtastic/deviceonly.proto (NodeDatabase, NodeInfoLite, sat entries) + src/mesh/NodeDB.h:467-484 (bitfield bit positions) + src/mesh/NodeDB.cpp:1523-1524 (pb_decode entry point) +""" + +from __future__ import annotations + +import argparse +import json +import pathlib +import sys +import time +from typing import Any + +# Prefer the in-tree generated Python protobuf bindings (bin/_generated/meshtastic_v25/) +# because the firmware branch's protos (v25 NodeDatabase satellite arrays, slim +# NodeInfoLite) are typically newer than what the PyPI `meshtastic` package +# ships. Run `bin/regen-py-protos.sh` to (re)generate. 
+# +# Namespace note: the local bindings live under `meshtastic_v25` (NOT `meshtastic`) +# to avoid shadowing the PyPI `meshtastic` package — bin/regen-py-protos.sh +# post-processes the protoc output to rename the package. +_HERE = pathlib.Path(__file__).resolve().parent +_LOCAL_PROTO_DIR = _HERE / "_generated" +if _LOCAL_PROTO_DIR.is_dir(): + sys.path.insert(0, str(_LOCAL_PROTO_DIR)) + +try: + from meshtastic_v25.deviceonly_pb2 import ( # type: ignore[import-not-found] + NodeDatabase, + NodeInfoLite, + NodePositionEntry, + NodeTelemetryEntry, + NodeEnvironmentEntry, + NodeStatusEntry, + PositionLite, + ) + from meshtastic_v25.mesh_pb2 import HardwareModel, Position, StatusMessage # type: ignore[import-not-found] + from meshtastic_v25.config_pb2 import Config # type: ignore[import-not-found] + from meshtastic_v25.telemetry_pb2 import DeviceMetrics, EnvironmentMetrics # type: ignore[import-not-found] +except ImportError as local_err: + # Fall back to the PyPI package if in-tree bindings haven't been generated. + # Will fail the v25 assertion below if the PyPI package predates the + # satellite-DB schema, but at least gives a clear "run regen-py-protos.sh" + # error message instead of an opaque ImportError. 
+ try: + from meshtastic.protobuf.deviceonly_pb2 import ( + NodeDatabase, + NodeInfoLite, + NodePositionEntry, + NodeTelemetryEntry, + NodeEnvironmentEntry, + NodeStatusEntry, + PositionLite, + ) + from meshtastic.protobuf.mesh_pb2 import HardwareModel, Position, StatusMessage + from meshtastic.protobuf.config_pb2 import Config + from meshtastic.protobuf.telemetry_pb2 import DeviceMetrics, EnvironmentMetrics + except ImportError as pypi_err: + print( + "ERROR: could not import meshtastic protobuf bindings.\n" + " In-tree generation: run `bin/regen-py-protos.sh` (requires protoc).\n" + " PyPI fallback: `pip install meshtastic` (may lag firmware branch).\n" + f" local error (meshtastic_v25): {local_err}\n" + f" pypi error (meshtastic.protobuf): {pypi_err}", + file=sys.stderr, + ) + sys.exit(1) + +# Fail loudly if bindings predate v25 (no satellite arrays). +assert ( + hasattr(NodeDatabase, "DESCRIPTOR") + and "positions" in NodeDatabase.DESCRIPTOR.fields_by_name +), ( + "Loaded meshtastic bindings are older than v25 (NodeDatabase.positions missing). " + "Run `bin/regen-py-protos.sh` against the in-tree protobufs/ submodule." +) + +# --------------------------------------------------------------------------- +# Bitfield bit positions (mirror src/mesh/NodeDB.h:467-484). 
+# --------------------------------------------------------------------------- +BIT_IS_KEY_MANUALLY_VERIFIED = 0 +BIT_IS_MUTED = 1 +BIT_VIA_MQTT = 2 +BIT_IS_FAVORITE = 3 +BIT_IS_IGNORED = 4 +BIT_HAS_USER = 5 +BIT_IS_LICENSED = 6 +BIT_IS_UNMESSAGABLE = 7 +BIT_HAS_IS_UNMESSAGABLE = 8 + +BITFIELD_LAYOUT = ( + # JSON key bit position + ("is_key_manually_verified", BIT_IS_KEY_MANUALLY_VERIFIED), + ("is_muted", BIT_IS_MUTED), + ("via_mqtt", BIT_VIA_MQTT), + ("is_favorite", BIT_IS_FAVORITE), + ("is_ignored", BIT_IS_IGNORED), + ("has_user", BIT_HAS_USER), + ("is_licensed", BIT_IS_LICENSED), + ("is_unmessagable", BIT_IS_UNMESSAGABLE), + ("has_is_unmessagable", BIT_HAS_IS_UNMESSAGABLE), +) + + +def _pack_bitfield(bf: dict[str, bool]) -> int: + out = 0 + for key, shift in BITFIELD_LAYOUT: + if bf.get(key, False): + out |= (1 << shift) + return out + + +def _validate_node(node: dict[str, Any]) -> None: + """Friendly errors so hand-editors get clear feedback.""" + if "num" not in node or not isinstance(node["num"], str): + raise ValueError(f"node missing/invalid 'num' (must be hex string): {node!r}") + if "long_name" not in node: + raise ValueError(f"node {node['num']}: missing 'long_name'") + if len(node["long_name"]) > 24: + raise ValueError( + f"node {node['num']}: long_name {node['long_name']!r} is " + f"{len(node['long_name'])} chars; max 24 (nanopb max_size:25 minus NUL)" + ) + if "short_name" in node: + # short_name max_size:5 (incl. NUL) → 4 bytes of content. + # Char count is irrelevant — emojis with variation selectors (e.g. ❄️ = 6 B) + # would slip past a `len(str) > 4` check. Always measure bytes. 
+ b = node["short_name"].encode("utf-8") + if len(b) > 4: + raise ValueError( + f"node {node['num']}: short_name {node['short_name']!r} is " + f"{len(b)} bytes UTF-8; max 4 (nanopb max_size:5 minus NUL)" + ) + pk = node.get("public_key_hex", "") + if pk and len(pk) != 64: + raise ValueError( + f"node {node['num']}: public_key_hex must be 64 hex chars or empty; " + f"got {len(pk)} chars" + ) + if pk: + try: + bytes.fromhex(pk) + except ValueError as e: + raise ValueError(f"node {node['num']}: public_key_hex is not valid hex: {e}") + + +def _resolve_time( + node: dict[str, Any], + field_absolute: str, + field_offset: str, + now_epoch: int, +) -> int: + """If `field_absolute` is set, use it; else compute `now_epoch - offset`.""" + if field_absolute in node and node[field_absolute] is not None: + return int(node[field_absolute]) + offset = node.get(field_offset, 0) + return max(0, int(now_epoch) - int(offset)) + + +def _build_node_info_lite(node: dict[str, Any], now_epoch: int) -> NodeInfoLite: + _validate_node(node) + info = NodeInfoLite() + info.num = int(node["num"], 16) if isinstance(node["num"], str) else int(node["num"]) + info.long_name = node.get("long_name", "") + info.short_name = node.get("short_name", "") + # Enum lookups will raise ValueError on unknown names — that's exactly what we want. + info.hw_model = HardwareModel.Value(node.get("hw_model", "UNSET")) + info.role = Config.DeviceConfig.Role.Value(node.get("role", "CLIENT")) + pk_hex = node.get("public_key_hex", "") + if pk_hex: + info.public_key = bytes.fromhex(pk_hex) + info.snr = float(node.get("snr", 0.0)) + info.channel = int(node.get("channel", 0)) + if "hops_away" in node: + # `optional uint32 hops_away = 9;` — in Python protobuf, assigning the + # field implicitly sets HasField("hops_away") to True. No has_hops_away + # setter exists (unlike the C++ nanopb-generated header). 
+ info.hops_away = int(node["hops_away"]) + info.next_hop = int(node.get("next_hop", 0)) + info.last_heard = _resolve_time(node, "last_heard", "last_heard_offset_sec", now_epoch) + info.bitfield = _pack_bitfield(node.get("bitfield", {})) + return info + + +def _build_position_entry(num: int, pos: dict[str, Any], now_epoch: int) -> NodePositionEntry: + entry = NodePositionEntry() + entry.num = num + pl = PositionLite() + # Firmware stores lat/long as int32 in 1e-7 degrees. + pl.latitude_i = int(round(float(pos["latitude"]) * 1e7)) + pl.longitude_i = int(round(float(pos["longitude"]) * 1e7)) + pl.altitude = int(pos.get("altitude", 0)) + pl.time = _resolve_time(pos, "time", "time_offset_sec", now_epoch) + pl.location_source = Position.LocSource.Value(pos.get("location_source", "LOC_UNSET")) + entry.position.CopyFrom(pl) + return entry + + +def _build_telemetry_entry(num: int, tel: dict[str, Any]) -> NodeTelemetryEntry: + entry = NodeTelemetryEntry() + entry.num = num + dm = DeviceMetrics() + if "battery_level" in tel: + dm.battery_level = int(tel["battery_level"]) + if "voltage" in tel: + dm.voltage = float(tel["voltage"]) + if "channel_utilization" in tel: + dm.channel_utilization = float(tel["channel_utilization"]) + if "air_util_tx" in tel: + dm.air_util_tx = float(tel["air_util_tx"]) + if "uptime_seconds" in tel: + dm.uptime_seconds = int(tel["uptime_seconds"]) + entry.device_metrics.CopyFrom(dm) + return entry + + +def _build_environment_entry(num: int, env: dict[str, Any]) -> NodeEnvironmentEntry: + entry = NodeEnvironmentEntry() + entry.num = num + em = EnvironmentMetrics() + if "temperature" in env: + em.temperature = float(env["temperature"]) + if "relative_humidity" in env: + em.relative_humidity = float(env["relative_humidity"]) + if "barometric_pressure" in env: + em.barometric_pressure = float(env["barometric_pressure"]) + if "iaq" in env: + em.iaq = int(env["iaq"]) + entry.environment_metrics.CopyFrom(em) + return entry + + +def _build_status_entry(num: 
int, status: dict[str, Any]) -> NodeStatusEntry: + # `StatusMessage` (mesh.proto:1445) has a single `string status` field. + entry = NodeStatusEntry() + entry.num = num + sm = StatusMessage() + if "status" in status: + sm.status = str(status["status"]) + entry.status.CopyFrom(sm) + return entry + + +def compile_jsonl_to_proto(jsonl_path: pathlib.Path, now_epoch: int) -> bytes: + """Read a seed JSONL and return the encoded NodeDatabase bytes.""" + lines = jsonl_path.read_text(encoding="utf-8").splitlines() + if not lines: + raise ValueError(f"{jsonl_path} is empty") + meta_line = lines[0] + meta_obj = json.loads(meta_line) + meta = meta_obj.get("_meta", {}) + version = meta.get("version") + if version != 25: + raise ValueError( + f"{jsonl_path}: meta version is {version!r}; this compiler " + f"requires version=25. Regenerate the seed with the matching tooling." + ) + + db = NodeDatabase() + db.version = 25 + + for ln, raw in enumerate(lines[1:], start=2): + raw = raw.strip() + if not raw: + continue + try: + node = json.loads(raw) + except json.JSONDecodeError as e: + raise ValueError(f"{jsonl_path}:{ln} JSON parse error: {e}") + + num = int(node["num"], 16) if isinstance(node["num"], str) else int(node["num"]) + + # Header + info = _build_node_info_lite(node, now_epoch) + db.nodes.append(info) + + # Satellites (nullable) + if node.get("position"): + db.positions.append(_build_position_entry(num, node["position"], now_epoch)) + if node.get("telemetry"): + db.telemetry.append(_build_telemetry_entry(num, node["telemetry"])) + if node.get("environment"): + db.environment.append(_build_environment_entry(num, node["environment"])) + if node.get("status"): + db.status.append(_build_status_entry(num, node["status"])) + + return db.SerializeToString() + + +def main(argv: list[str]) -> int: + p = argparse.ArgumentParser( + description="Compile a seed JSONL into a binary v25 NodeDatabase proto.", + formatter_class=argparse.ArgumentDefaultsHelpFormatter, + ) + 
p.add_argument("--in", dest="in_path", required=True, help="Input seed JSONL.") + p.add_argument("--out", required=True, help="Output binary .proto path.") + p.add_argument( + "--now-epoch", + type=int, + default=None, + help="Pin 'now' to this Unix epoch (for byte-identical CI). Default: time.time().", + ) + args = p.parse_args(argv) + + in_path = pathlib.Path(args.in_path) + if not in_path.is_file(): + print(f"input not found: {in_path}", file=sys.stderr) + return 2 + + now_epoch = args.now_epoch if args.now_epoch is not None else int(time.time()) + + try: + encoded = compile_jsonl_to_proto(in_path, now_epoch) + except ValueError as e: + print(f"ERROR: {e}", file=sys.stderr) + return 3 + + out_path = pathlib.Path(args.out) + out_path.parent.mkdir(parents=True, exist_ok=True) + out_path.write_bytes(encoded) + print( + f"compiled {in_path} -> {out_path} ({len(encoded)} bytes, now_epoch={now_epoch})", + file=sys.stderr, + ) + return 0 + + +if __name__ == "__main__": + sys.exit(main(sys.argv[1:])) diff --git a/bin/show-unmerged-prs.sh b/bin/show-unmerged-prs.sh new file mode 100755 index 00000000000..2a76f63d604 --- /dev/null +++ b/bin/show-unmerged-prs.sh @@ -0,0 +1,118 @@ +#!/bin/bash + +# Script to show commits in develop that are not in master +# with their associated PR info and commit hashes +# +# Usage: +# ./show-unmerged-prs.sh # Show all unmerged commits +# ./show-unmerged-prs.sh --bugfix # Show only bugfix-labeled PRs + +set -e + +REPO="firmware" +OWNER="meshtastic" +BASE_BRANCH="master" +HEAD_BRANCH="develop" +LIMIT=100 +FILTER_LABEL="" + +# Parse arguments +for arg in "$@"; do + case $arg in + --bugfix) + FILTER_LABEL="bugfix" + shift + ;; + --feature) + FILTER_LABEL="feature" + shift + ;; + --help) + echo "Usage: $0 [OPTIONS]" + echo "Options:" + echo " --bugfix Show only PRs labeled with 'bugfix'" + echo " --feature Show only PRs labeled with 'feature'" + echo " --help Show this help message" + exit 0 + ;; + esac +done + +if [ -n "$FILTER_LABEL" ]; then 
+ echo "Fetching commits in $HEAD_BRANCH that are not in $BASE_BRANCH (filtered by label: $FILTER_LABEL)..." +else + echo "Fetching commits in $HEAD_BRANCH that are not in $BASE_BRANCH..." +fi +echo "" + +# Check if gh CLI is available +if ! command -v gh &> /dev/null; then + echo "ERROR: GitHub CLI (gh) not found. Please install it first." + echo "Visit: https://cli.github.com/" + exit 1 +fi + +# Get commits in develop that are not in master +# For each commit, try to find associated PR +git fetch origin develop master 2>/dev/null || true + +# Use git to get the list of commits +commits=$(git log --pretty=format:"%H|%s" origin/master..origin/develop | head -n $LIMIT) + +count=0 +displayed=0 +echo "Commits in $HEAD_BRANCH not in $BASE_BRANCH:" +echo "==============================================" +echo "" + +while IFS='|' read -r hash subject; do + count=$((count + 1)) # not ((count++)): that returns status 1 when count is 0 and trips 'set -e' + + # Try to find the PR for this commit + # Extract PR number, title, description, and labels + pr_response=$(gh api -X GET "/repos/$OWNER/$REPO/commits/$hash/pulls" \ + -H "Accept: application/vnd.github.v3+json" 2>/dev/null | \ + jq -r '.[0] | "\(.number)|\(.title)|\(.body // "No description")|\(.labels | map(.name) | join(","))"' 2>/dev/null || echo "||||") + + if [ -z "$pr_response" ] || [ "$pr_response" = "||||" ]; then + # If no PR found, skip if filter is active, otherwise show the commit + if [ -z "$FILTER_LABEL" ]; then + displayed=$((displayed + 1)) + echo "[$displayed] Commit: $hash" + echo " Subject: $subject" + echo " PR: Not found in GitHub" + echo "" + fi + else + IFS='|' read -r pr_num pr_title pr_desc pr_labels <<< "$pr_response" + + # Check if filter matches + if [ -n "$FILTER_LABEL" ]; then + # Only show if the label is in the labels list + if ! 
echo "$pr_labels" | grep -q "$FILTER_LABEL"; then + continue + fi + fi + + displayed=$((displayed + 1)) # avoids the ((displayed++)) exit under 'set -e' when the value is 0 + echo "[$displayed] PR #$pr_num - $pr_title" + echo " Commit: $hash" + if [ -n "$pr_desc" ] && [ "$pr_desc" != "No description" ]; then + # Truncate description to 200 chars + desc_short="${pr_desc:0:200}" + [ ${#pr_desc} -gt 200 ] && desc_short+="..." + echo " Description: $desc_short" + fi + if [ -n "$pr_labels" ] && [ "$pr_labels" != "" ]; then + echo " Labels: $pr_labels" + fi + echo "" + fi +done <<< "$commits" + +echo "" +if [ -n "$FILTER_LABEL" ]; then + echo "Done. Showing $displayed PRs with label '$FILTER_LABEL' from $count commits checked." +else + echo "Done. Showing $displayed commits from $HEAD_BRANCH not in $BASE_BRANCH." +fi diff --git a/boards/CDEBYTE_EoRa-S3.json b/boards/CDEBYTE_EoRa-S3.json index afaabc5a7e9..2355cecd3ab 100644 --- a/boards/CDEBYTE_EoRa-S3.json +++ b/boards/CDEBYTE_EoRa-S3.json @@ -7,7 +7,7 @@ "extra_flags": [ "-D CDEBYTE_EORA_S3", "-D ARDUINO_USB_CDC_ON_BOOT=1", - "-D ARDUINO_USB_MODE=0", + "-D ARDUINO_USB_MODE=1", "-D ARDUINO_RUNNING_CORE=1", "-D ARDUINO_EVENT_RUNNING_CORE=1", "-D BOARD_HAS_PSRAM" diff --git a/boards/ThinkNode-M7.json b/boards/ThinkNode-M7.json new file mode 100644 index 00000000000..2a0c5e5838a --- /dev/null +++ b/boards/ThinkNode-M7.json @@ -0,0 +1,42 @@ +{ + "build": { + "arduino": { + "ldscript": "esp32s3_out.ld", + "memory_type": "qio_opi" + }, + "core": "esp32", + "extra_flags": [ + "-D BOARD_HAS_PSRAM", + "-D ARDUINO_USB_CDC_ON_BOOT=0", + "-D ARDUINO_USB_MODE=0", + "-D ARDUINO_RUNNING_CORE=1", + "-D ARDUINO_EVENT_RUNNING_CORE=0" + ], + "f_cpu": "240000000L", + "f_flash": "80000000L", + "flash_mode": "qio", + "psram_type": "qio_opi", + "hwids": [["0x303A", "0x1001"]], + "mcu": "esp32s3", + "variant": "ELECROW-ThinkNode-M7" + }, + "connectivity": ["wifi", "bluetooth", "lora"], + "debug": { + "default_tool": "esp-builtin", + 
"onboard_tools": ["esp-builtin"], + "openocd_target": "esp32s3.cfg" + }, + "frameworks": 
["arduino", "espidf"], + "name": "ELECROW ThinkNode M7", + "upload": { + "flash_size": "8MB", + "maximum_ram_size": 524288, + "maximum_size": 8388608, + "use_1200bps_touch": true, + "wait_for_upload_port": true, + "require_upload_port": true, + "speed": 921600 + }, + "url": "https://www.elecrow.com", + "vendor": "ELECROW" +} diff --git a/boards/bpi_picow_esp32_s3.json b/boards/bpi_picow_esp32_s3.json index 75983d8450d..62ad666f154 100644 --- a/boards/bpi_picow_esp32_s3.json +++ b/boards/bpi_picow_esp32_s3.json @@ -6,7 +6,7 @@ "core": "esp32", "extra_flags": [ "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1", "-DBOARD_HAS_PSRAM" diff --git a/boards/hackaday-communicator.json b/boards/hackaday-communicator.json index 6e6c1ad2d09..5aedf5d19b7 100644 --- a/boards/hackaday-communicator.json +++ b/boards/hackaday-communicator.json @@ -8,7 +8,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/heltec_v4_r8.json b/boards/heltec_v4_r8.json new file mode 100644 index 00000000000..6dd97c84b22 --- /dev/null +++ b/boards/heltec_v4_r8.json @@ -0,0 +1,43 @@ +{ + "build": { + "arduino": { + "ldscript": "esp32s3_out.ld", + "partitions": "default_16MB.csv", + "memory_type": "qio_opi" + }, + "core": "esp32", + "extra_flags": [ + "-DBOARD_HAS_PSRAM", + "-DARDUINO_USB_CDC_ON_BOOT=1", + "-DARDUINO_USB_MODE=1", + "-DARDUINO_RUNNING_CORE=1", + "-DARDUINO_EVENT_RUNNING_CORE=1" + ], + "f_cpu": "240000000L", + "f_flash": "80000000L", + "flash_mode": "qio", + "psram_type": "opi", + "hwids": [["0x303A", "0x1001"]], + "mcu": "esp32s3", + "variant": "heltec_v4_r8" + }, + "connectivity": ["wifi", "bluetooth", "lora"], + "debug": { + "default_tool": "esp-builtin", + "onboard_tools": ["esp-builtin"], + "openocd_target": "esp32s3.cfg" + }, + 
"frameworks": ["arduino", "espidf"], + "name": "heltec_wifi_lora_32 v4 r8 (16 MB FLASH, 8 MB PSRAM)", + "upload": { + "flash_size": "16MB", + "maximum_ram_size": 327680, + "maximum_size": 16777216, + "use_1200bps_touch": true, + "wait_for_upload_port": true, + "require_upload_port": true, + "speed": 921600 + }, + "url": "https://heltec.org/", + "vendor": "heltec" +} diff --git a/boards/heltec_vision_master_e213.json b/boards/heltec_vision_master_e213.json index 152515cf375..d9d5f85824e 100644 --- a/boards/heltec_vision_master_e213.json +++ b/boards/heltec_vision_master_e213.json @@ -9,7 +9,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/heltec_vision_master_e290.json b/boards/heltec_vision_master_e290.json index b7cbac8786f..171125338ad 100644 --- a/boards/heltec_vision_master_e290.json +++ b/boards/heltec_vision_master_e290.json @@ -9,7 +9,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/heltec_vision_master_t190.json b/boards/heltec_vision_master_t190.json index 440f76ad01f..fbdf1f09d89 100644 --- a/boards/heltec_vision_master_t190.json +++ b/boards/heltec_vision_master_t190.json @@ -9,7 +9,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/heltec_wireless_tracker.json b/boards/heltec_wireless_tracker.json index 04c6e5553f8..59d0daa1597 100644 --- a/boards/heltec_wireless_tracker.json +++ b/boards/heltec_wireless_tracker.json @@ -8,7 +8,7 @@ "extra_flags": [ "-DHELTEC_WIRELESS_TRACKER", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", 
"-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/heltec_wireless_tracker_v2.json b/boards/heltec_wireless_tracker_v2.json index 502954e69d0..3d20f7edbcc 100644 --- a/boards/heltec_wireless_tracker_v2.json +++ b/boards/heltec_wireless_tracker_v2.json @@ -7,7 +7,7 @@ "core": "esp32", "extra_flags": [ "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/nrf54l15dk.json b/boards/nrf54l15dk.json new file mode 100644 index 00000000000..863ad290b4d --- /dev/null +++ b/boards/nrf54l15dk.json @@ -0,0 +1,26 @@ +{ + "build": { + "cpu": "cortex-m33", + "f_cpu": "128000000L", + "mcu": "nrf54l15", + "zephyr": { + "variant": "nrf54l15dk/nrf54l15/cpuapp" + } + }, + "connectivity": ["bluetooth"], + "debug": { + "default_tools": ["jlink"], + "jlink_device": "nRF54L15_M33", + "svd_path": "nrf54l15.svd" + }, + "frameworks": ["zephyr"], + "name": "Nordic nRF54L15-DK (PCA10156)", + "upload": { + "maximum_ram_size": 262144, + "maximum_size": 1572864, + "protocol": "jlink", + "protocols": ["jlink"] + }, + "url": "https://www.nordicsemi.com/Products/nRF54L15", + "vendor": "Nordic Semiconductor" +} diff --git a/boards/seeed-xiao-s3.json b/boards/seeed-xiao-s3.json index 6981085ddc8..d8e5d6b94ae 100644 --- a/boards/seeed-xiao-s3.json +++ b/boards/seeed-xiao-s3.json @@ -8,7 +8,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=0" ], diff --git a/boards/station-g3.json b/boards/station-g3.json new file mode 100644 index 00000000000..615f8bb4013 --- /dev/null +++ b/boards/station-g3.json @@ -0,0 +1,41 @@ +{ + "build": { + "arduino": { + "ldscript": "esp32s3_out.ld", + "memory_type": "qio_opi" + }, + "core": "esp32", + "extra_flags": [ + "-DBOARD_HAS_PSRAM", + "-DARDUINO_USB_CDC_ON_BOOT=1", + "-DARDUINO_USB_MODE=1", + 
"-DARDUINO_RUNNING_CORE=1", + "-DARDUINO_EVENT_RUNNING_CORE=0" + ], + "f_cpu": "240000000L", + "f_flash": "80000000L", + "flash_mode": "qio", + "hwids": [["0x303A", "0x1001"]], + "mcu": "esp32s3", + "variant": "station-g3" + }, + "connectivity": ["wifi", "bluetooth", "lora"], + "debug": { + "default_tool": "esp-builtin", + "onboard_tools": ["esp-builtin"], + "openocd_target": "esp32s3.cfg" + }, + "frameworks": ["arduino", "espidf"], + "name": "BQ Station G3", + "upload": { + "flash_size": "16MB", + "maximum_ram_size": 327680, + "maximum_size": 16777216, + "use_1200bps_touch": true, + "wait_for_upload_port": true, + "require_upload_port": true, + "speed": 921600 + }, + "url": "", + "vendor": "BQ Consulting" +} diff --git a/boards/t-beam-1w.json b/boards/t-beam-1w.json index 40f16195d0f..80776ee055f 100644 --- a/boards/t-beam-1w.json +++ b/boards/t-beam-1w.json @@ -9,7 +9,7 @@ "-DBOARD_HAS_PSRAM", "-DLILYGO_TBEAM_1W", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/t-deck.json b/boards/t-deck.json index b112921b9b5..33a34b60dcf 100644 --- a/boards/t-deck.json +++ b/boards/t-deck.json @@ -8,7 +8,7 @@ "extra_flags": [ "-DBOARD_HAS_PSRAM", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/tbeam-s3-core.json b/boards/tbeam-s3-core.json index 7bda2e5a0a3..8d2c3eed6a2 100644 --- a/boards/tbeam-s3-core.json +++ b/boards/tbeam-s3-core.json @@ -8,7 +8,7 @@ "-DBOARD_HAS_PSRAM", "-DLILYGO_TBEAM_S3_CORE", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/boards/tlora-t3s3-v1.json b/boards/tlora-t3s3-v1.json index 0bfd17afc29..652b4178ebe 100644 --- a/boards/tlora-t3s3-v1.json +++ b/boards/tlora-t3s3-v1.json @@ -7,7 +7,7 @@ "extra_flags": [ 
"-DLILYGO_T3S3_V1", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1", "-DBOARD_HAS_PSRAM" diff --git a/boards/unphone.json b/boards/unphone.json index 4d37f7bb52d..72075f5aef5 100644 --- a/boards/unphone.json +++ b/boards/unphone.json @@ -10,7 +10,7 @@ "-DBOARD_HAS_PSRAM", "-DUNPHONE_SPIN=9", "-DARDUINO_USB_CDC_ON_BOOT=1", - "-DARDUINO_USB_MODE=0", + "-DARDUINO_USB_MODE=1", "-DARDUINO_RUNNING_CORE=1", "-DARDUINO_EVENT_RUNNING_CORE=1" ], diff --git a/debian/changelog b/debian/changelog index c3f1424a5d3..6b9d0668efd 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +meshtasticd (2.7.24.0) unstable; urgency=medium + + * Version 2.7.24 + + -- GitHub Actions Fri, 08 May 2026 10:44:12 +0000 + meshtasticd (2.7.23.0) unstable; urgency=medium * Version 2.7.23 diff --git a/debian/ci_pack_sdeb.sh b/debian/ci_pack_sdeb.sh index 7b2418ff671..d35aeef24e3 100755 --- a/debian/ci_pack_sdeb.sh +++ b/debian/ci_pack_sdeb.sh @@ -1,4 +1,5 @@ #!/usr/bin/bash +set -e export DEBEMAIL="jbennett@incomsystems.biz" export PLATFORMIO_LIBDEPS_DIR=pio/libdeps export PLATFORMIO_PACKAGES_DIR=pio/packages diff --git a/default_16MB.csv b/default_16MB.csv new file mode 100644 index 00000000000..67d773728e9 --- /dev/null +++ b/default_16MB.csv @@ -0,0 +1,7 @@ +# Name, Type, SubType, Offset, Size, Flags +nvs, data, nvs, 0x9000, 0x5000, +otadata, data, ota, 0xe000, 0x2000, +app0, app, ota_0, 0x10000, 0x640000, +app1, app, ota_1, 0x650000,0x640000, +spiffs, data, spiffs, 0xc90000,0x360000, +coredump, data, coredump,0xFF0000,0x10000, diff --git a/default_8MB.csv b/default_8MB.csv new file mode 100644 index 00000000000..4e92afa6936 --- /dev/null +++ b/default_8MB.csv @@ -0,0 +1,7 @@ +# Name, Type, SubType, Offset, Size, Flags +nvs, data, nvs, 0x9000, 0x5000, +otadata, data, ota, 0xe000, 0x2000, +app0, app, ota_0, 0x10000, 0x330000, +app1, app, ota_1, 0x340000,0x330000, +spiffs, data, spiffs, 
0x670000,0x180000, +coredump, data, coredump,0x7F0000,0x10000, diff --git a/extra_scripts/esp32_extra.py b/extra_scripts/esp32_extra.py index f7698561af9..975ec0f30da 100755 --- a/extra_scripts/esp32_extra.py +++ b/extra_scripts/esp32_extra.py @@ -70,17 +70,6 @@ def esp32_create_combined_bin(source, target, env): env.AddPostAction("$BUILD_DIR/${PROGNAME}.bin", esp32_create_combined_bin) -esp32_kind = env.GetProjectOption("custom_esp32_kind") -if esp32_kind == "esp32": - # Free up some IRAM by removing auxiliary SPI flash chip drivers. - # Wrapped stub symbols are defined in src/platform/esp32/iram-quirk.c. - env.Append( - LINKFLAGS=[ - "-Wl,--wrap=esp_flash_chip_gd", - "-Wl,--wrap=esp_flash_chip_issi", - "-Wl,--wrap=esp_flash_chip_winbond", - ] - ) -else: - # For newer ESP32 targets, using newlib nano works better. - env.Append(LINKFLAGS=["--specs=nano.specs", "-u", "_printf_float"]) +# Enable Newlib Nano formatting to save space +# ...but allow printf float support (compromise) +env.Append(LINKFLAGS=["--specs=nano.specs", "-u", "_printf_float"]) diff --git a/extra_scripts/ld_response_file.py b/extra_scripts/ld_response_file.py new file mode 100644 index 00000000000..e79475f197c --- /dev/null +++ b/extra_scripts/ld_response_file.py @@ -0,0 +1,23 @@ +#!/usr/bin/env python3 +# trunk-ignore-all(ruff/F821) +# trunk-ignore-all(flake8/F821): For SConstruct imports + +# force linker response file instead of command line arguments + +Import("env") + + +def wrap_with_tempfile(command_key): + command = env.get(command_key) + if not command or not isinstance(command, str): + return + if "TEMPFILE(" in command: + return + env.Replace(**{command_key: "${TEMPFILE('%s')}" % command}) + + +# Force SCons to spill long commands into response files on this target. 
+env.Replace(MAXLINELENGTH=8192) + +for key in ("LINKCOM", "CXXLINKCOM", "SHLINKCOM", "SHCXXLINKCOM"): + wrap_with_tempfile(key) diff --git a/extra_scripts/nrf54l15_linker.py b/extra_scripts/nrf54l15_linker.py new file mode 100644 index 00000000000..fa104155307 --- /dev/null +++ b/extra_scripts/nrf54l15_linker.py @@ -0,0 +1,140 @@ +#!/usr/bin/env python3 +# trunk-ignore-all(ruff/F821) +# trunk-ignore-all(flake8/F821): For SConstruct imports +# +# post:extra_scripts/nrf54l15_linker.py +# +# Fix for Zephyr two-pass link on nRF54L15: +# platformio-build.py registers env.Depends("$PROG_PATH", final_ld_script) but +# the SCons dependency chain is broken (final_ld_script Command never runs). +# This script adds a PreAction on the final firmware binary that runs the gcc +# preprocessing command directly (extracted from build.ninja) to generate +# zephyr/linker.cmd before the link step. +# +# PlatformIO bundles an old Ninja that can't handle multi-output depslog rules, +# so we parse the COMMAND line from build.ninja and run just the gcc -E part, +# skipping the cmake_transform_depfile step (only needed for Ninja deps tracking). + +import os +import re +import subprocess + +Import("env") + +if env.get("PIOENV") != "nrf54l15dk": + pass # Only for the nrf54l15dk environment +else: + + def _extract_gcc_command(ninja_build): + """Parse build.ninja to find the gcc -E command that generates linker.cmd. + + The rule format depends on the host: + Windows (CMake's RunCMake wraps every command): + COMMAND = cmd.exe /C "cd /D DIR && arm-none-eabi-gcc.exe ... -o linker.cmd && cmake.exe -E cmake_transform_depfile ..." + POSIX (Linux/macOS — no wrapper): + COMMAND = cd DIR && arm-none-eabi-gcc ... -o linker.cmd && cmake -E cmake_transform_depfile ... + + Returns (gcc_cmd_string, cwd_path) or raises RuntimeError. 
+ """ + in_rule = False + with open(ninja_build, "r", encoding="utf-8", errors="replace") as f: + for line in f: + # Detect start of the linker.cmd custom command rule + if not in_rule: + if "build zephyr/linker.cmd" in line and "CUSTOM_COMMAND" in line: + in_rule = True + continue + + stripped = line.strip() + if not stripped.startswith("COMMAND = "): + continue + + command_val = stripped[len("COMMAND = ") :] + + # On Windows the value is wrapped in `cmd.exe /C "..."` — strip + # the wrapper. On POSIX hosts the inner sequence is the value + # itself (no quoting layer). + m = re.search(r'/C\s+"(.*)"\s*$', command_val) + inner = m.group(1) if m else command_val + parts = inner.split(" && ") + + cwd = None + gcc_cmd = None + for part in parts: + part = part.strip() + if part.startswith("cd /D "): # Windows form + cwd = part[len("cd /D ") :] + elif part.startswith("cd "): # POSIX form + cwd = part[len("cd ") :] + elif "arm-none-eabi-gcc" in part: + gcc_cmd = part + + if not gcc_cmd: + raise RuntimeError( + "nRF54L15 linker fix: arm-none-eabi-gcc command not found in:\n%s" + % inner[:400] + ) + + return gcc_cmd, cwd + + raise RuntimeError( + "nRF54L15 linker fix: 'build zephyr/linker.cmd' rule not found in build.ninja" + ) + + def _generate_linker_cmd(target, source, env): + """Generate zephyr/linker.cmd via direct gcc invocation before the final link.""" + build_dir = env.subst("$BUILD_DIR") + zephyr_dir = os.path.join(build_dir, "zephyr") + linker_cmd = os.path.join(zephyr_dir, "linker.cmd") + + if os.path.exists(linker_cmd): + return # Already present — nothing to do + + ninja_build = os.path.join(build_dir, "build.ninja") + if not os.path.exists(ninja_build): + raise RuntimeError( + "nRF54L15 linker fix: build.ninja not found at %s\n" + "Run a full build first so CMake generates the Ninja files." 
+ % ninja_build + ) + + gcc_cmd, cwd = _extract_gcc_command(ninja_build) + run_cwd = cwd if cwd else zephyr_dir + + print( + "==> nRF54L15: Generating zephyr/linker.cmd (LINKER_ZEPHYR_FINAL) via GCC" + ) + # gcc_cmd comes verbatim from our own build.ninja (never user input) and + # contains Windows-style paths with spaces that cannot be safely argv-split + # with shlex, so we run it via the platform shell. nosec/nosemgrep below + # acknowledge this deliberate, scoped use of shell=True. + result = subprocess.run( # nosec B602 + gcc_cmd, + shell=True, # nosemgrep: python.lang.security.audit.subprocess-shell-true.subprocess-shell-true + cwd=run_cwd, + capture_output=True, + text=True, + ) + if result.returncode != 0: + print("GCC stdout:", result.stdout[:2000]) + print("GCC stderr:", result.stderr[:2000]) + raise RuntimeError( + "nRF54L15 linker fix: GCC failed to generate linker.cmd (rc=%d)" + % result.returncode + ) + if not os.path.exists(linker_cmd): + raise RuntimeError( + "nRF54L15 linker fix: GCC returned 0 but linker.cmd was not created at %s" + % linker_cmd + ) + print("==> linker.cmd generated successfully") + + # Use PIOMAINPROG (set by ZephyrBuildProgram) to get the exact SCons node + prog = env.get("PIOMAINPROG") + if prog: + env.AddPreAction(prog, _generate_linker_cmd) + else: + print( + "[nrf54l15_linker] WARNING: PIOMAINPROG not set, falling back to $PROG_PATH" + ) + env.AddPreAction(env.subst("$PROG_PATH"), _generate_linker_cmd) diff --git a/mcp-server/.gitignore b/mcp-server/.gitignore new file mode 100644 index 00000000000..744a4401de0 --- /dev/null +++ b/mcp-server/.gitignore @@ -0,0 +1,35 @@ +.venv/ +__pycache__/ +*.py[cod] +*.egg-info/ +.pytest_cache/ +.mypy_cache/ +dist/ +build/ + +# Persistent device-log capture (recorder + Datadog cursor). +# Cross-session JSONL streams written by the autouse Recorder singleton +# (see src/meshtastic_mcp/recorder/). Lives outside tests/ so the pytest +# fixture truncate doesn't touch it. 
+.mtlog/ + +# Test harness artifacts +tests/report.html +tests/junit.xml +tests/reportlog.jsonl +tests/fwlog.jsonl +# Subprocess-output tee from pio/esptool/nrfutil/picotool (live flash +# progress for the TUI; also a post-run diagnostic for plain CLI runs). +tests/flash.log +tests/tool_coverage.json +tests/.coverage +htmlcov/ +# Persistent run counter for meshtastic-mcp-test-tui header. +tests/.tui-runs +# Cross-run history (TUI duration sparkline). +tests/.history/ +# Reproducer bundles (TUI `x` export on failed tests). +tests/reproducers/ +# UI-tier camera captures + per-test transcripts. Regenerated every run; +# left on disk for human review between runs. +tests/ui_captures/ diff --git a/mcp-server/README.md b/mcp-server/README.md new file mode 100644 index 00000000000..22ce77fbcb6 --- /dev/null +++ b/mcp-server/README.md @@ -0,0 +1,412 @@ +# Meshtastic MCP Server + +An [MCP](https://modelcontextprotocol.io) server for working with the Meshtastic firmware repo and connected devices. Lets Claude Code / Claude Desktop: + +- Discover USB-connected Meshtastic devices +- Enumerate PlatformIO board variants (166+) with Meshtastic metadata +- Build, clean, flash, erase-and-flash (factory), and OTA-update firmware +- Read serial logs via `pio device monitor` (with board-specific exception decoders) +- Trigger 1200bps touch-reset for bootloader entry (nRF52, ESP32-S3, RP2040) +- Query and administer a running node via the [`meshtastic` Python API](https://github.com/meshtastic/python): owner name, config (LocalConfig + ModuleConfig), channels, messaging, reboot/shutdown/factory-reset +- Call `esptool`, `nrfutil`, `picotool` directly when PlatformIO doesn't cover the operation + +## Design principle + +**PlatformIO first.** Its `pio run -t upload` knows the correct protocol, offsets, and post-build chain for every variant in `variants/`. 
Direct vendor-tool wrappers (`esptool_*`, `nrfutil_*`, `picotool_*`) exist as escape hatches for operations pio doesn't cover (blank-chip erase, DFU `.zip` packages, BOOTSEL-mode inspection). + +## Prerequisites + +- Python ≥ 3.11 +- [PlatformIO Core](https://platformio.org/install/cli) — `pio` on `$PATH` or at `~/.platformio/penv/bin/pio` +- The Meshtastic firmware repo checked out somewhere (set via `MESHTASTIC_FIRMWARE_ROOT`) +- Optional: `esptool`, `nrfutil`, `picotool` on `$PATH` (or under the firmware venv at `.venv/bin/`) if you want to use the direct-tool wrappers + +## Install + +```bash +cd /mcp-server +python3 -m venv .venv +.venv/bin/pip install -e . +``` + +Verify: + +```bash +MESHTASTIC_FIRMWARE_ROOT= .venv/bin/python -m meshtastic_mcp +``` + +The server blocks on stdin (that's correct — it speaks MCP over stdio). Ctrl-C to exit. + +## Register with Claude Code + +Edit `~/.claude/settings.json` (global) or `/.claude/settings.local.json` (project-only): + +```json +{ + "mcpServers": { + "meshtastic": { + "command": "/mcp-server/.venv/bin/python", + "args": ["-m", "meshtastic_mcp"], + "env": { + "MESHTASTIC_FIRMWARE_ROOT": "" + } + } + } +} +``` + +Replace `` with the absolute path, e.g. `/Users/you/GitHub/firmware`. Restart Claude Code after editing. + +## Register with Claude Desktop + +Same `mcpServers` block, but in `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows). + +## Tools (43) + +### Discovery & metadata + +| Tool | What it does | +| -------------- | ------------------------------------------------------------------------------------------ | +| `list_devices` | USB/serial port listing, flags likely-Meshtastic candidates | +| `list_boards` | PlatformIO envs with `custom_meshtastic_*` metadata; filters by arch/supported/query/level | +| `get_board` | Full env dict incl. 
raw pio config | + +### Build & flash + +| Tool | What it does | +| ----------------- | -------------------------------------------------------------------- | +| `build` | `pio run -e ` (+ mtjson target) | +| `clean` | `pio run -e -t clean` | +| `pio_flash` | `pio run -e -t upload --upload-port ` — any architecture | +| `erase_and_flash` | ESP32 full factory flash via `bin/device-install.sh` | +| `update_flash` | ESP32 OTA app-partition update via `bin/device-update.sh` | +| `touch_1200bps` | 1200-baud open/close to trigger USB CDC bootloader entry | + +### Serial log sessions + +Backed by long-running `pio device monitor` subprocesses with a 10k-line ring buffer per session and board-specific filters (`esp32_exception_decoder` auto-selected when you pass `env=`). + +| Tool | What it does | +| -------------- | ------------------------------------------------------------------ | +| `serial_open` | Start a monitor session; returns `session_id` | +| `serial_read` | Cursor-based pull; reports `dropped` if lines aged out of the ring | +| `serial_list` | All active sessions | +| `serial_close` | Terminate a session | + +### Device reads + +| Tool | What it does | +| ------------- | --------------------------------------------------------------------------- | +| `device_info` | my_node_num, long/short name, firmware version, region, channel, node count | +| `list_nodes` | Full node database with position, SNR, RSSI, last_heard, battery | + +_The tool tables below document 38 currently registered MCP server tools._ + +### Device writes + +| Tool | What it does | +| ------------------- | -------------------------------------------------------------------------- | +| `set_owner` | Long name + optional short name (≤4 chars) | +| `get_config` | One section or all (LocalConfig + ModuleConfig) | +| `set_config` | Dot-path field write: `lora.region`=`"US"`, `device.role`=`"ROUTER"`, etc. 
| +| `get_channel_url` | Primary-only or include_all=admin URL | +| `set_channel_url` | Import channels from a Meshtastic URL | +| `set_debug_log_api` | Enable or disable debug logging for the Meshtastic Python API client | +| `send_text` | Broadcast or direct text message | +| `reboot` | `localNode.reboot(secs)` — requires `confirm=True` | +| `shutdown` | `localNode.shutdown(secs)` — requires `confirm=True` | +| `factory_reset` | `localNode.factoryReset(full?)` — requires `confirm=True` | + +### Direct hardware tools (escape hatches) + +| Tool | What it does | +| --------------------- | --------------------------------------------------------- | +| `esptool_chip_info` | Read chip, MAC, crystal, flash size | +| `esptool_erase_flash` | Full-chip erase (destructive) | +| `esptool_raw` | Pass-through; confirm=True required for write/erase/merge | +| `nrfutil_dfu` | DFU-flash a `.zip` package | +| `nrfutil_raw` | Pass-through | +| `picotool_info` | Read Pico BOOTSEL-mode info | +| `picotool_load` | Load a UF2 | +| `picotool_raw` | Pass-through | + +### USB power control (uhubctl) + +| Tool | What it does | +| --------------- | ----------------------------------------------------------- | +| `uhubctl_list` | Enumerate USB hubs + attached-device VID/PID (read-only) | +| `uhubctl_power` | Drive a hub port `on` or `off`; `off` requires confirm=True | +| `uhubctl_cycle` | Off → wait `delay_s` → on; confirm=True required | + +Target a port by explicit `(location, port)` (raw uhubctl syntax like +`location="1-1.3", port=2`) or by `role` (`"nrf52"`, `"esp32s3"`). Role +lookup checks `MESHTASTIC_UHUBCTL_LOCATION_` + +`MESHTASTIC_UHUBCTL_PORT_` env vars first, then auto-detects via VID +against `uhubctl`'s output. + +Requires [`uhubctl`](https://github.com/mvp/uhubctl) on PATH: + +```bash +brew install uhubctl # macOS +apt install uhubctl # Debian/Ubuntu +``` + +Modern macOS + PPPS-capable hubs generally work without root. 
On Linux +without udev rules, or on old macOS with driver quirks, you may need +`sudo`. If uhubctl returns a permission error the MCP tool raises a +clear `UhubctlError` pointing at the +[udev-rules / sudo fallback](https://github.com/mvp/uhubctl#linux-usb-permissions) +rather than auto-`sudo`'ing mid-run. + +## Safety + +- **All destructive flash/admin tools require `confirm=True`** as a tool-level gate, on top of any permission prompt from Claude. +- **Serial port is exclusive.** If a `serial_*` session is active on a port, `device_info`/admin tools on the same port will fail fast with a pointer at the active `session_id`. Close the session first. +- **Flash confirmation by architecture**: `erase_and_flash` / `update_flash` error if the env's architecture isn't ESP32 — use `pio_flash` for nRF52/RP2040/STM32. + +## Environment variables + +| Var | Default | Purpose | +| -------------------------- | ----------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| `MESHTASTIC_FIRMWARE_ROOT` | walks up from cwd for `platformio.ini` | Pin the firmware repo | +| `MESHTASTIC_PIO_BIN` | `~/.platformio/penv/bin/pio` → `$PATH` `pio` → `platformio` | Override `pio` location | +| `MESHTASTIC_ESPTOOL_BIN` | `/.venv/bin/esptool` → `$PATH` | Override esptool | +| `MESHTASTIC_NRFUTIL_BIN` | `$PATH` | Override nrfutil | +| `MESHTASTIC_PICOTOOL_BIN` | `$PATH` | Override picotool | +| `MESHTASTIC_MCP_SEED` | `mcp--` | PSK seed for test-harness session (CI override) | +| `MESHTASTIC_MCP_FLASH_LOG` | `/tests/flash.log` | Tee target for pio/esptool/nrfutil subprocess output (TUI tails it) | +| `MESHTASTIC_MCP_TCP_HOST` | unset | `host` or `host:port` of a `meshtasticd` daemon to surface as a TCP device (see "TCP / native-host nodes" below) | + +## TCP / native-host nodes + +The `native-macos` and `native` PlatformIO envs build a headless `meshtasticd` +binary that runs on 
the host (Apple Silicon / Intel macOS, or Linux Portduino). +The daemon exposes the meshtastic TCP API on port `4403` rather than a USB +serial endpoint — point the MCP server at it via `MESHTASTIC_MCP_TCP_HOST`: + +```bash +# 1. Build + run a daemon on this host (see variants/native/portduino/platformio.ini +# for full Homebrew prereqs and CH341 LoRa-adapter setup). +pio run -e native-macos +~/.meshtasticd/meshtasticd + +# 2. Point the MCP server at it. +export MESHTASTIC_MCP_TCP_HOST=localhost # or host:port, default port 4403 +``` + +**First-run gotcha — MAC address.** `meshtasticd` derives its MAC from the +USB adapter's serial-number / product strings. Many cheap CH341 dongles +(MeshStick included — VID 0x1A86 / PID 0x5512) ship with `iSerialNumber=0` +and `iProduct=0`, so the daemon aborts on boot with `*** Blank MAC Address +not allowed!`. Set the MAC explicitly in `config.yaml`: + +```yaml +# Under General: +MACAddress: 02:CA:FE:BA:BE:01 +``` + +Use a locally-administered address (first byte's second-LSB set, e.g. +`02:*` / `06:*` / `0A:*` / `0E:*`) to avoid colliding with a real OUI. + +There is also a `--hwid AA:BB:CC:DD:EE:FF` CLI flag visible in +`meshtasticd --help`, but it is **currently broken** in +`MAC_from_string()` (`src/platform/portduino/PortduinoGlue.cpp`): the +function strips colons from its parameter but then reads bytes from the +global `portduino_config.mac_address`, so `--hwid` is silently overridden +when `MACAddress:` is also set, and crashes the daemon (uncaught +`std::invalid_argument: stoi: no conversion`) when it isn't. Use the YAML +form until that's fixed upstream. + +`list_devices` will surface the daemon as `tcp://localhost:4403` with +`likely_meshtastic=True`, so `device_info`, `list_nodes`, `get_config`, +`set_config`, `set_owner`, `send_text`, `userprefs_*`, and the admin RPCs +auto-select it when no `port` is passed. Pass `port="tcp://other-host:9999"` +explicitly to target a different daemon. 
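The `host` or `host:port` rule for `MESHTASTIC_MCP_TCP_HOST` can be sketched as below. This is only an illustration under stated assumptions: `tcp_port_from_env` is a hypothetical helper name, not a function from the server, and the only facts taken from the text are the default port `4403` and the `tcp://host:port` form that `list_devices` reports. IPv6 literals are deliberately not handled.

```python
import os

DEFAULT_TCP_PORT = 4403  # default meshtastic TCP API port, per the section above


def tcp_port_from_env(environ=None):
    """Return 'tcp://host:port' for MESHTASTIC_MCP_TCP_HOST, or None if unset.

    Hypothetical helper for illustration; the real server may parse differently.
    """
    environ = os.environ if environ is None else environ
    raw = (environ.get("MESHTASTIC_MCP_TCP_HOST") or "").strip()
    if not raw:
        return None  # variable unset or empty: no TCP device to surface
    host, sep, port_text = raw.rpartition(":")
    if not sep:
        # No colon at all: the whole value is the host, use the default port.
        host, port = raw, DEFAULT_TCP_PORT
    else:
        port = int(port_text)  # raises ValueError on a non-numeric port
    if not host:
        raise ValueError(f"missing host in {raw!r}")
    return f"tcp://{host}:{port}"
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the real environment.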
+ +**Tools that don't apply to a TCP/native node** (no USB hardware to operate +on) raise a clear `ConnectionError` rather than failing mysteriously: +`pio_flash`, `erase_and_flash`, `update_flash`, `touch_1200bps`, +`serial_open` (use info/admin tools directly), and the vendor escape hatches +`esptool_*`, `nrfutil_*`, `picotool_*`. `pio_flash` against a `native*` env +similarly raises — there's no upload step; use `build` and run the binary +directly. + +The pytest harness in `tests/` still assumes USB-attached devices per role — +TCP-aware fixtures are not part of this surface yet. + +## Hardware Test Suite + +`mcp-server/tests/` holds a pytest-based integration suite that exercises +real USB-connected Meshtastic devices against the MCP server surface. Separate +from the native C++ unit tests in the firmware repo's top-level `test/` +directory — this one validates the device-facing behavior end-to-end. + +### Invocation + +```bash +./mcp-server/run-tests.sh # full suite (auto-detect + auto-bake-if-needed) +./mcp-server/run-tests.sh --force-bake # reflash devices before testing +./mcp-server/run-tests.sh --assume-baked # skip the bake step (caller vouches for state) +./mcp-server/run-tests.sh tests/mesh # one tier +./mcp-server/run-tests.sh tests/mesh/test_traceroute.py # one file +./mcp-server/run-tests.sh -k telemetry # pytest name filter +``` + +The wrapper auto-detects connected devices (VID `0x239A` → `nrf52` → env +`rak4631`; `0x303A` or `0x10C4` → `esp32s3` → env `heltec-v3`), exports +`MESHTASTIC_MCP_ENV_` env vars, and invokes pytest. Overrides via +per-role env vars: `MESHTASTIC_MCP_ENV_NRF52=heltec-mesh-node-t114 ./run-tests.sh`. + +No hardware connected? The wrapper narrows to `tests/unit/` only and says so +in the pre-flight header. + +### Tiers (run in this order) + +- **`bake`** (`tests/test_00_bake.py`) — flashes both hub roles with the + session's test profile. Has a skip-if-already-baked check (region + channel + match); `--force-bake` overrides. 
+- **`unit`** — pure Python, no hardware. boards / PIO wrapper / + userPrefs-parse / testing-profile fixtures. +- **`mesh`** — 2-device mesh: formation, broadcast delivery, direct+ACK, + traceroute, bidirectional. Parametrized over both directions. Includes + `test_peer_offline_recovery` which uses uhubctl to power-cycle one peer + mid-conversation and verifies the mesh recovers (skips without uhubctl). +- **`telemetry`** — periodic telemetry broadcast + on-demand request/reply + (`TELEMETRY_APP` with `wantResponse=True`). +- **`monitor`** — boot log has no panic markers within 60 s of reboot. +- **`recovery`** — `uhubctl` power-cycle round-trip: verifies the hub port + can be toggled off/on, the device re-enumerates with the same + `my_node_num`, and NVS-resident config (region, channel, modem preset) + survives a hard reset. Requires `uhubctl` on PATH; skips cleanly otherwise. +- **`ui`** — input-broker-driven screen navigation (`AdminMessage.send_input_event` + injection → `Screen::handleInputEvent` → frame transition). Parametrized + on the screen-bearing role (heltec-v3 OLED). Captures images via USB + webcam + OCRs them for HTML-report evidence. Requires `pip install -e '.[ui]'` + and `MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3=`; tier is auto-deselected + if `cv2` isn't importable. +- **`fleet`** — PSK-seed isolation: two labs with different seeds never + overlap. +- **`admin`** — owner persistence across reboot, channel URL round-trip, + `lora.hop_limit` persistence. +- **`provisioning`** — region/channel baking, userPrefs survive + `factory_reset(full=False)`. + +#### UI tier setup + +The `tests/ui/` tier drives the on-device OLED via the firmware's existing +`AdminMessage.send_input_event` RPC (no firmware changes required) and +verifies transitions via a macro-gated log line + camera + OCR. Summary: + +1. Install extras: `pip install -e 'mcp-server/.[ui]'` — pulls in + `opencv-python-headless`, `numpy`, `easyocr`, `Pillow`. 
First easyocr + run downloads ~100 MB of models to `~/.EasyOCR/`; an autouse session + fixture pre-warms the reader so per-test OCR is <100 ms after that. +2. Point a USB webcam at the heltec-v3 OLED. Discover its index: + ```bash + .venv/bin/python -c "import cv2; [print(i, cv2.VideoCapture(i).read()[0]) for i in range(5)]" + ``` +3. Export the per-role device env var: + ```bash + export MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3=0 + ``` +4. Run: + ```bash + ./run-tests.sh tests/ui -v + ``` + Captures land under `tests/ui_captures///`, one + PNG + `.ocr.txt` per `frame_capture()` call, with a per-test + `transcript.md` stepping through event → frame → OCR. The HTML report + embeds the full image strip inline (pass or fail). + +On macOS, `cv2.VideoCapture(0)` triggers the TCC Camera permission prompt +on first use. Pre-grant Terminal (or your IDE's terminal) before running. +The `OpenCVBackend` fails fast on 10 consecutive black frames so a silent +permission denial surfaces as a clear error, not an empty PNG strip. + +No camera? Set `MESHTASTIC_UI_CAMERA_BACKEND=null` (or leave the device var +unset). Tests still exercise the event-injection path and log assertions; +captures just become 1×1 black PNGs. + +### Artifacts (regenerated every run, under `tests/`) + +- `report.html` — self-contained pytest-html report. Each test gets a + **Meshtastic debug** section attached on failure with a 200-line firmware + log tail + device-state dump. Open this first on failures. +- `junit.xml` — CI-parseable. +- `reportlog.jsonl` — `pytest-reportlog` event stream; consumed by the TUI. +- `fwlog.jsonl` — firmware log mirror (`meshtastic.log.line` pubsub → JSONL). +- `flash.log` — tee of all pio / esptool / nrfutil / picotool subprocess + output during the run (driven by `MESHTASTIC_MCP_FLASH_LOG`). 
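Because `junit.xml` is described as the CI-parseable artifact, a small script can bucket its test cases by outcome. The sketch below assumes the conventional JUnit layout that pytest emits (`testcase` elements carrying optional `failure`, `error`, or `skipped` children). The sample document and the `summarize_junit` name are made up for illustration and are not output from a real run.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the usual JUnit shape; not real suite output.
SAMPLE_JUNIT = """<testsuites>
  <testsuite name="pytest" tests="3" failures="1" errors="0" skipped="1">
    <testcase classname="tests.mesh.test_traceroute" name="test_traceroute" time="12.5"/>
    <testcase classname="tests.mesh.test_direct" name="test_direct_with_ack" time="30.1">
      <failure message="timeout waiting for ACK">traceback text</failure>
    </testcase>
    <testcase classname="tests.recovery.test_cycle" name="test_power_cycle" time="0.0">
      <skipped message="uhubctl not on PATH"/>
    </testcase>
  </testsuite>
</testsuites>
"""


def summarize_junit(xml_text):
    """Group test ids from a JUnit XML string into passed/failed/skipped lists."""
    root = ET.fromstring(xml_text)
    summary = {"passed": [], "failed": [], "skipped": []}
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname')}::{case.get('name')}"
        if case.find("failure") is not None or case.find("error") is not None:
            summary["failed"].append(test_id)
        elif case.find("skipped") is not None:
            summary["skipped"].append(test_id)
        else:
            summary["passed"].append(test_id)
    return summary


if __name__ == "__main__":
    for outcome, ids in summarize_junit(SAMPLE_JUNIT).items():
        print(outcome, ids)
```

For a real run, reading `tests/junit.xml` from disk and passing its text to the same function should work, provided the file follows this layout.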
+ +### Live TUI + +```bash +.venv/bin/meshtastic-mcp-test-tui +.venv/bin/meshtastic-mcp-test-tui tests/mesh # pytest args pass through +``` + +Textual-based wrapper over `run-tests.sh` with a live test tree, tier +counters, pytest output pane, firmware-log pane, and a device-status strip. +Key bindings: `r` re-run focused, `f` filter, `d` failure detail, `g` open +`report.html`, `x` export reproducer bundle, `l` cycle fw-log filter, `q` +quit (SIGINT → SIGTERM → SIGKILL escalation). + +Set `MESHTASTIC_UI_TUI_CAMERA=1` to mount a bottom-of-screen **UI camera** +panel. Left side: the latest capture PNG rendered as Unicode half-blocks +(via `rich-pixels`, works in any terminal — no kitty/sixel required). +Right side: live transcript tail ("step 3 — frame 4/8 name=nodelist_nodes +— OCR: Nodes 2/2") so you can see every event-injection and its result +as each UI test runs. Requires the `[ui]` extras for image rendering; the +transcript alone works without them. + +### Slash commands + +Three AI-assisted workflows are wired up for Claude Code operators +(`.claude/commands/`) and Copilot operators (`.github/prompts/`): +`/test` (run + interpret), `/diagnose` (read-only health report), `/repro` +(flake triage, N-times re-run with log diff). + +### House rules (for human + agent contributors) + +- Session-scoped fixtures in `tests/conftest.py` snapshot + restore + `userPrefs.jsonc`; **never edit `userPrefs.jsonc` from inside a test**. + Use the `test_profile` / `no_region_profile` fixtures for ephemeral + overrides. +- `SerialInterface` holds an **exclusive port lock**; sequence calls + open → mutate → close, then next device. No parallel calls to the + same port. +- Directed PKI-encrypted sends need **bilateral NodeInfo warmup** — + both sides must hold the other's current pubkey. See + `tests/mesh/_receive.py::nudge_nodeinfo_port` and the three directed- + send tests (`test_direct_with_ack`, `test_traceroute`, + `test_telemetry_request_reply`) for the canonical pattern. 
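The exclusive-port rule above (open, mutate, close, then move to the next device) can be illustrated with a fail-fast per-port lock. This is a simplified in-process sketch: `exclusive_port` and `PortBusyError` are hypothetical names, a `threading.Lock` stands in for the real serial port lock, and the device paths are placeholders.

```python
import threading
from contextlib import contextmanager

_port_locks = {}                    # port name -> threading.Lock
_registry_guard = threading.Lock()  # protects _port_locks itself


class PortBusyError(RuntimeError):
    """Raised when a port is already held by another caller."""


@contextmanager
def exclusive_port(port):
    """Hold `port` for the duration of the with-block, failing fast if busy."""
    with _registry_guard:
        lock = _port_locks.setdefault(port, threading.Lock())
    if not lock.acquire(blocking=False):
        raise PortBusyError(f"{port} is already in use; close the other session first")
    try:
        yield port
    finally:
        lock.release()


if __name__ == "__main__":
    # Sequential pattern: finish one device completely before touching the next.
    for port in ("/dev/ttyACM0", "/dev/ttyUSB0"):  # placeholder paths
        with exclusive_port(port):
            print("open -> mutate -> close on", port)
```

Failing fast instead of blocking mirrors the behavior described in the Safety section, where a busy port produces an immediate error rather than a hang.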
+ +## Layout + +```text +mcp-server/ +├── pyproject.toml +├── README.md +└── src/meshtastic_mcp/ + ├── __main__.py # entry: python -m meshtastic_mcp + ├── server.py # FastMCP app + @app.tool() registrations (thin) + ├── config.py # firmware_root, pio_bin, esptool_bin, etc. + ├── pio.py # subprocess wrapper (timeouts, JSON, tail_lines) + ├── devices.py # list_devices (findPorts + comports) + ├── boards.py # list_boards / get_board (pio project config parse + cache) + ├── flash.py # build, clean, flash, erase_and_flash, update_flash, touch_1200bps + ├── serial_session.py # SerialSession + reader thread + ring buffer + ├── registry.py # session registry + per-port locks + ├── connection.py # connect(port) ctx mgr — SerialInterface + port lock + ├── info.py # device_info, list_nodes + ├── admin.py # set_owner, get/set_config, channels, send_text, reboot/shutdown/factory_reset + └── hw_tools.py # esptool / nrfutil / picotool wrappers +``` + +## Troubleshooting + +- **"Could not locate Meshtastic firmware root"** — set `MESHTASTIC_FIRMWARE_ROOT`. +- **"Could not find `pio`"** — install PlatformIO or set `MESHTASTIC_PIO_BIN`. +- **"Port is held by serial session ..."** — call `serial_close(session_id)` or `serial_list` to find it. +- **`factory.bin` not found after build** — the env may not be ESP32; only ESP32 envs produce a `.factory.bin`. +- **`touch_1200bps` reported `new_port: null`** — the device may not have 1200bps-reset stdio, or the bootloader re-uses the same port name. Check `list_devices` manually. diff --git a/mcp-server/pyproject.toml b/mcp-server/pyproject.toml new file mode 100644 index 00000000000..3241c843f4e --- /dev/null +++ b/mcp-server/pyproject.toml @@ -0,0 +1,54 @@ +[project] +name = "meshtastic-mcp" +version = "0.1.0" +description = "MCP server for Meshtastic firmware development: device discovery, PlatformIO tooling, flashing, serial monitoring, and device administration via the meshtastic Python API." 
+readme = "README.md" +requires-python = ">=3.11" +license = { text = "GPL-3.0-only" } +authors = [{ name = "thebentern" }] +dependencies = ["mcp>=1.2", "pyserial>=3.5", "meshtastic>=2.7.8"] + +[project.optional-dependencies] +dev = ["pytest>=7"] +test = [ + "pytest>=8", + "pytest-html>=4", + "pytest-reportlog>=0.4", + "pytest-timeout>=2.3", + "coverage[toml]>=7", + "pyyaml>=6", + # textual is required by the `meshtastic-mcp-test-tui` script (see + # `src/meshtastic_mcp/cli/test_tui.py`). Bundled into `test` rather than a + # separate `[tui]` extra because v1 expects test operators are the only + # consumers; revisit if install cost pushes back. + "textual>=0.50", +] +# UI test tier + `capture_screen` MCP tool. Optional because the ML OCR +# model alone is ~100 MB and camera hardware is user-supplied. +# pip install -e '.[ui]' — full (OpenCV + easyocr) +# pip install -e '.[ui-min]' — image capture only, no OCR +ui = [ + "opencv-python-headless>=4.9", + "numpy>=1.26", + "easyocr>=1.7", + "Pillow>=10.0", + # Renders the latest camera capture as Unicode half-blocks in the TUI + # (MESHTASTIC_UI_TUI_CAMERA=1). Terminal-agnostic — no kitty / sixel + # dependency. Pure Python, tiny. + "rich-pixels>=3.0", +] +ui-min = ["opencv-python-headless>=4.9", "numpy>=1.26"] + +[project.scripts] +meshtastic-mcp = "meshtastic_mcp.__main__:main" +# Live TUI wrapping run-tests.sh — shells out to the same script the plain +# CLI uses, tails pytest-reportlog for per-test state, and polls the device +# list at startup + post-run (port lock forces it to stay idle during the run). 
+meshtastic-mcp-test-tui = "meshtastic_mcp.cli.test_tui:main" + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.hatch.build.targets.wheel] +packages = ["src/meshtastic_mcp"] diff --git a/mcp-server/run-tests.sh b/mcp-server/run-tests.sh new file mode 100755 index 00000000000..c84a8f75153 --- /dev/null +++ b/mcp-server/run-tests.sh @@ -0,0 +1,270 @@ +#!/usr/bin/env bash +# mcp-server hardware test runner. +# +# Auto-detects connected Meshtastic devices, maps each to its PlatformIO env +# via the same role table the pytest fixtures use, exports the right +# MESHTASTIC_MCP_ENV_* env vars, and invokes pytest. +# +# Usage: +# ./run-tests.sh # full suite, default pytest args +# ./run-tests.sh tests/mesh # subset (any pytest args pass through) +# ./run-tests.sh --force-bake # override one default with another +# MESHTASTIC_MCP_ENV_NRF52=foo ./run-tests.sh # override env per role +# MESHTASTIC_MCP_SEED=ci-run-42 ./run-tests.sh # override PSK seed +# +# If zero supported devices are detected, only the unit tier runs. +# +# Also restores `userPrefs.jsonc` from the session-backup sidecar if a prior +# run exited abnormally (belt to conftest.py's atexit suspenders). + +set -euo pipefail + +# cd to the script's directory so relative paths resolve consistently no +# matter where the user invoked from. +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +cd "$SCRIPT_DIR" + +VENV_PY="$SCRIPT_DIR/.venv/bin/python" +if [[ ! -x $VENV_PY ]]; then + echo "error: $VENV_PY not found or not executable." >&2 + echo " Bootstrap the venv first:" >&2 + echo " cd $SCRIPT_DIR && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'" >&2 + exit 2 +fi + +# Resolve firmware root the same way conftest.py does (this script sits in +# mcp-server/, firmware repo root is one level up). +FIRMWARE_ROOT="$(cd "$SCRIPT_DIR/.." 
&& pwd)" +USERPREFS_PATH="$FIRMWARE_ROOT/userPrefs.jsonc" +USERPREFS_SIDECAR="$USERPREFS_PATH.mcp-session-bak" + +# ---------- Pre-flight: recover stale userPrefs.jsonc from prior crash ---- +# If conftest.py's atexit hook didn't fire (SIGKILL, kernel panic, OS +# restart), the sidecar is the ground truth. Self-heal before running so we +# don't bake the previous run's dirty state into this run's firmware. +if [[ -f $USERPREFS_SIDECAR ]]; then + echo "[pre-flight] found $USERPREFS_SIDECAR from a prior abnormal exit;" >&2 + echo " restoring userPrefs.jsonc before starting." >&2 + cp "$USERPREFS_SIDECAR" "$USERPREFS_PATH" + rm -f "$USERPREFS_SIDECAR" +fi + +# If userPrefs.jsonc has uncommitted changes BEFORE the run starts, that's +# worth warning about — tests will snapshot this dirty state and restore to +# it at the end, which may not be what the operator wants. +if command -v git >/dev/null 2>&1; then + cd "$FIRMWARE_ROOT" + # Capture the git status into a local first — SC2312 flags command + # substitution inside `[[ -n ... ]]` because the exit code of `git + # status` is masked. A two-step assignment makes the failure path + # explicit (non-git, missing file) and keeps the bracket test clean. + _git_status_porcelain="$(git status --porcelain userPrefs.jsonc 2>/dev/null || true)" + if [[ -n $_git_status_porcelain ]]; then + echo "[pre-flight] warning: userPrefs.jsonc has uncommitted changes." >&2 + echo " Tests will snapshot THIS state and restore to it" >&2 + echo " at teardown. If that's not intended, run:" >&2 + echo " git checkout userPrefs.jsonc" >&2 + echo " and re-invoke." >&2 + fi + cd "$SCRIPT_DIR" +fi + +# ---------- Seed default -------------------------------------------------- +# Per-machine default so repeated runs from the same operator land on the +# same PSK (makes --assume-baked valid across invocations). Operator can +# override with an explicit env var if they want isolation (e.g. CI). 
+if [[ -z ${MESHTASTIC_MCP_SEED-} ]]; then + WHO="$(whoami 2>/dev/null || echo anon)" + HOST="$(hostname -s 2>/dev/null || echo host)" + export MESHTASTIC_MCP_SEED="mcp-${WHO}-${HOST}" +fi + +# ---------- Flash progress log -------------------------------------------- +# pio.py / hw_tools.py tee subprocess output (pio run -t upload, esptool, +# nrfutil, picotool) to this file line-by-line as it arrives when this env +# var is set. The TUI tails it so the operator sees live flash progress +# instead of 3 minutes of silence during `test_00_bake.py`. Plain CLI users +# also benefit — the log is a post-run diagnostic even without the TUI. +# Truncate at session start so each run gets a clean log. +export MESHTASTIC_MCP_FLASH_LOG="$SCRIPT_DIR/tests/flash.log" +: >"$MESHTASTIC_MCP_FLASH_LOG" + +# ---------- Detect connected hardware ------------------------------------- +# In-process call to the same Python API the test fixtures use, so the +# script never drifts from what pytest sees. Returns a JSON object +# {role: port, ...}. +ROLES_JSON="$( + "$VENV_PY" - <<'PY' +import json +import sys + +sys.path.insert(0, "src") +from meshtastic_mcp import devices + +# Role → canonical VID map. Kept in sync with +# `tests/conftest.py::hub_profile` defaults; if that changes, this must too. +ROLE_BY_VID = { + 0x239A: "nrf52", # Adafruit / RAK nRF52 native USB (app + DFU) + 0x303A: "esp32s3", # Espressif native USB (ESP32-S3) + 0x10C4: "esp32s3", # CP2102 USB-UART (common on Heltec/LilyGO ESP32 boards) +} + +out: dict[str, str] = {} +for dev in devices.list_devices(include_unknown=True): + vid_raw = dev.get("vid") or "" + try: + if isinstance(vid_raw, str) and vid_raw.startswith("0x"): + vid = int(vid_raw, 16) + else: + vid = int(vid_raw) + except (TypeError, ValueError): + continue + role = ROLE_BY_VID.get(vid) + # First port wins per role — matches hub_devices fixture semantics. 
+ if role and role not in out: + out[role] = dev["port"] + +json.dump(out, sys.stdout) +PY +)" + +# ---------- Map role → pio env -------------------------------------------- +# Honor MESHTASTIC_MCP_ENV_ operator overrides; fall back to the +# same defaults hardcoded in tests/conftest.py::_DEFAULT_ROLE_ENVS. +resolve_env() { + local role="$1" + local default="$2" + local upper + upper="$(echo "$role" | tr '[:lower:]' '[:upper:]')" + local var="MESHTASTIC_MCP_ENV_${upper}" + eval "local override=\${$var:-}" + if [[ -n $override ]]; then + echo "$override" + else + echo "$default" + fi +} + +NRF52_PORT="$(echo "$ROLES_JSON" | "$VENV_PY" -c 'import json,sys; print(json.loads(sys.stdin.read()).get("nrf52", ""))')" +ESP32S3_PORT="$(echo "$ROLES_JSON" | "$VENV_PY" -c 'import json,sys; print(json.loads(sys.stdin.read()).get("esp32s3", ""))')" + +DETECTED="" +if [[ -n $NRF52_PORT ]]; then + NRF52_ENV="$(resolve_env nrf52 rak4631)" + export MESHTASTIC_MCP_ENV_NRF52="$NRF52_ENV" + DETECTED="${DETECTED} nrf52 @ ${NRF52_PORT} -> env=${NRF52_ENV}\n" +fi +if [[ -n $ESP32S3_PORT ]]; then + ESP32S3_ENV="$(resolve_env esp32s3 heltec-v3)" + export MESHTASTIC_MCP_ENV_ESP32S3="$ESP32S3_ENV" + DETECTED="${DETECTED} esp32s3 @ ${ESP32S3_PORT} -> env=${ESP32S3_ENV}\n" +fi + +# ---------- Pre-flight summary -------------------------------------------- +# Surface what pytest is about to do with respect to the bake phase: the +# operator should see "will verify + bake if needed" by default, so a +# 3-minute flash appearing mid-run isn't a surprise. Detection of the +# explicit overrides is best-effort — we just scan $@ for the known flags. 
+_bake_mode="auto (verify + bake if needed)" +for _arg in "$@"; do + case "$_arg" in + --assume-baked) _bake_mode="skip (--assume-baked)" ;; + --force-bake) _bake_mode="force (--force-bake)" ;; + *) ;; # any other arg: pass-through; bake mode unchanged + esac +done + +echo "mcp-server test runner" +echo " firmware root : $FIRMWARE_ROOT" +echo " seed : $MESHTASTIC_MCP_SEED" +echo " bake : $_bake_mode" +if [[ -n $DETECTED ]]; then + echo " detected hub :" + printf "%b" "$DETECTED" +else + echo " detected hub : (none)" +fi +echo + +# ---------- Invoke pytest ------------------------------------------------- +# If no devices detected, only the unit tier would produce meaningful +# PASS/FAIL — every hardware test would SKIP with "role not present". We +# narrow to tests/unit explicitly so the summary reads as "no hardware, +# unit suite only" instead of "big skip count looks suspicious". +if [[ -z $DETECTED && $# -eq 0 ]]; then + echo "[pre-flight] no supported devices detected; running unit tier only." + echo + exec "$VENV_PY" -m pytest tests/unit -v --report-log=tests/reportlog.jsonl +fi + +# Default pytest args when the user passed none. Power users can invoke +# `./run-tests.sh tests/mesh -v --tb=long` and skip all of these defaults. +# +# NOTE: `--assume-baked` is DELIBERATELY omitted here. `tests/test_00_bake.py` +# has an internal skip-if-already-baked check (`_bake_role`: query device_info, +# compare region + primary_channel to the session profile, skip on match). +# So the fast path is ~8-10 s of verification overhead when the devices are +# already baked — negligible next to the 2-6 min suite runtime. Letting +# test_00_bake.py run means a fresh device, a re-seeded session, or a post- +# factory-reset device gets flashed automatically instead of silently +# skipping half the hardware tests with "not baked with session profile" +# errors. Power users who know their hardware is current and want to shave +# those seconds can pass `--assume-baked` explicitly. 
+if [[ $# -eq 0 ]]; then + set -- tests/ \ + --html=tests/report.html --self-contained-html \ + --junitxml=tests/junit.xml \ + -v --tb=short +fi + +# UI tier requires opencv-python-headless (and ideally easyocr). If it's +# not installed, auto-deselect tests/ui so operators without the [ui] +# extra still get a green run. Printed in yellow; silent when cv2 is +# present. +_cv2_ok=0 +if "$VENV_PY" -c "import cv2" >/dev/null 2>&1; then + _cv2_ok=1 +fi +_running_ui=0 +for _arg in "$@"; do + case "$_arg" in + *tests/ui* | tests/) _running_ui=1 ;; + *) ;; + esac +done +if [[ $_running_ui -eq 1 && $_cv2_ok -eq 0 ]]; then + printf '\033[33m[pre-flight] tests/ui tier detected, but opencv-python-headless is not installed — deselecting.\033[0m\n' + printf ' install with: .venv/bin/pip install -e "mcp-server/.[ui]"\n' + echo + set -- "$@" --ignore=tests/ui +fi + +# Recovery tier needs `uhubctl` on PATH — it power-cycles devices via USB +# hub PPPS. The tier's conftest already skips cleanly, so this is just a +# friendly heads-up before the skip happens. `baked_single`'s auto- +# recovery hook also benefits from having uhubctl available across the +# whole suite. +if ! command -v uhubctl >/dev/null 2>&1; then + printf "\033[33m[pre-flight] uhubctl not found on PATH — recovery tier will skip, and\n" + printf " wedged-device auto-recovery is disabled.\033[0m\n" + printf " install with: brew install uhubctl (macOS) or apt install uhubctl (Debian/Ubuntu).\n" + echo +fi + +# Always emit `tests/reportlog.jsonl` (unless the operator explicitly passed +# their own `--report-log=...`). Consumers — notably the +# `meshtastic-mcp-test-tui` TUI — tail the reportlog for live per-test state. +# Appending here means power-user invocations like `./run-tests.sh tests/mesh` +# also produce it, not just the all-defaults invocation. 
+_has_report_log=0 +for _arg in "$@"; do + case "$_arg" in + --report-log | --report-log=*) _has_report_log=1 ;; + *) ;; # any other arg: no-op; loop continues + esac +done +if [[ $_has_report_log -eq 0 ]]; then + set -- "$@" --report-log=tests/reportlog.jsonl +fi + +exec "$VENV_PY" -m pytest "$@" diff --git a/mcp-server/scripts/datadog-dashboard.json b/mcp-server/scripts/datadog-dashboard.json new file mode 100644 index 00000000000..73aa3520132 --- /dev/null +++ b/mcp-server/scripts/datadog-dashboard.json @@ -0,0 +1,217 @@ +{ + "title": "Meshtastic Firmware — Recorder Stream", + "description": "Live view of `.mtlog/` streams shipped by `mtlog_to_datadog.py`. Heap, packet volume, log levels, errors. One row per port.", + "widgets": [ + { + "definition": { + "title": "Free heap (bytes)", + "type": "timeseries", + "show_legend": true, + "requests": [ + { + "queries": [ + { + "name": "free_heap", + "data_source": "metrics", + "query": "avg:mesh.local.heap_free_bytes{service:meshtastic-firmware} by {port}" + } + ], + "response_format": "timeseries", + "display_type": "line" + } + ], + "yaxis": { "label": "bytes" } + } + }, + { + "definition": { + "title": "Heap slope (bytes/min) — last 1h", + "type": "query_value", + "precision": 0, + "requests": [ + { + "queries": [ + { + "name": "slope", + "data_source": "metrics", + "query": "derivative(avg:mesh.local.heap_free_bytes{service:meshtastic-firmware})", + "aggregator": "avg" + } + ], + "response_format": "scalar" + } + ], + "conditional_formats": [ + { "comparator": "<", "value": -100, "palette": "white_on_red" }, + { "comparator": "<", "value": 0, "palette": "white_on_yellow" }, + { "comparator": ">=", "value": 0, "palette": "white_on_green" } + ] + } + }, + { + "definition": { + "title": "Total heap (bytes)", + "type": "timeseries", + "requests": [ + { + "queries": [ + { + "name": "total_heap", + "data_source": "metrics", + "query": "avg:mesh.local.heap_total_bytes{service:meshtastic-firmware} by {port}" + } + ], + 
"response_format": "timeseries", + "display_type": "line" + } + ] + } + }, + { + "definition": { + "title": "Battery level (%)", + "type": "timeseries", + "requests": [ + { + "queries": [ + { + "name": "battery", + "data_source": "metrics", + "query": "avg:mesh.device.battery_level{service:meshtastic-firmware} by {port}" + } + ], + "response_format": "timeseries", + "display_type": "line" + } + ], + "yaxis": { "min": "0", "max": "105" } + } + }, + { + "definition": { + "title": "Air utilization (TX %)", + "type": "timeseries", + "requests": [ + { + "queries": [ + { + "name": "airutil", + "data_source": "metrics", + "query": "avg:mesh.device.air_util_tx{service:meshtastic-firmware} by {port}" + } + ], + "response_format": "timeseries", + "display_type": "line" + } + ] + } + }, + { + "definition": { + "title": "Channel utilization (%)", + "type": "timeseries", + "requests": [ + { + "queries": [ + { + "name": "chutil", + "data_source": "metrics", + "query": "avg:mesh.device.channel_utilization{service:meshtastic-firmware} by {port}" + } + ], + "response_format": "timeseries", + "display_type": "line" + } + ] + } + }, + { + "definition": { + "title": "Log volume by level", + "type": "timeseries", + "show_legend": true, + "requests": [ + { + "response_format": "timeseries", + "display_type": "bars", + "queries": [ + { + "name": "log_count", + "data_source": "logs", + "indexes": ["*"], + "compute": { "aggregation": "count" }, + "search": { "query": "service:meshtastic-firmware" }, + "group_by": [ + { + "facet": "@level", + "limit": 10, + "sort": { "order": "desc", "aggregation": "count" } + } + ] + } + ] + } + ] + } + }, + { + "definition": { + "title": "Recent ERROR / CRIT firmware logs", + "type": "list_stream", + "requests": [ + { + "response_format": "event_list", + "query": { + "data_source": "logs_stream", + "query_string": "service:meshtastic-firmware (status:error OR @level:ERROR OR @level:CRIT)", + "indexes": [], + "sort": { "column": "timestamp", "order": 
"desc" } + }, + "columns": [ + { "field": "timestamp", "width": "auto" }, + { "field": "host", "width": "auto" }, + { "field": "@port", "width": "auto" }, + { "field": "@level", "width": "auto" }, + { "field": "@thread", "width": "auto" }, + { "field": "message", "width": "stretch" } + ] + } + ] + } + }, + { + "definition": { + "title": "Recorder marker events", + "type": "list_stream", + "requests": [ + { + "response_format": "event_list", + "query": { + "data_source": "logs_stream", + "query_string": "service:meshtastic-firmware @level:MARK", + "indexes": [], + "sort": { "column": "timestamp", "order": "desc" } + }, + "columns": [ + { "field": "timestamp", "width": "auto" }, + { "field": "host", "width": "auto" }, + { "field": "message", "width": "stretch" } + ] + } + ] + } + } + ], + "template_variables": [ + { + "name": "port", + "prefix": "port", + "available_values": [], + "default": "*" + }, + { "name": "host", "prefix": "host", "available_values": [], "default": "*" } + ], + "layout_type": "ordered", + "notify_list": [], + "reflow_type": "auto" +} diff --git a/mcp-server/scripts/mtlog_to_datadog.py b/mcp-server/scripts/mtlog_to_datadog.py new file mode 100755 index 00000000000..51496adc439 --- /dev/null +++ b/mcp-server/scripts/mtlog_to_datadog.py @@ -0,0 +1,389 @@ +#!/usr/bin/env python3 +"""Forward selected recorder JSONL streams to Datadog. + +Reads `.mtlog/logs.jsonl` and `.mtlog/telemetry.jsonl`, ships logs to the +Logs Intake API and telemetry numerics to the Metrics v2 series API. +Resumes from `.mtlog/.dd-cursor.json` so a daemon restart doesn't +duplicate rows already shipped from the current live files. + +This forwarder does not currently backfill rotated `.jsonl.gz` archives. +If the recorder rotates before this process drains the live file, or the +forwarder is down across a rotation, those older rows are skipped. + +Usage: + DD_API_KEY=... 
./scripts/mtlog_to_datadog.py --tail + ./scripts/mtlog_to_datadog.py --once # catch up + exit + ./scripts/mtlog_to_datadog.py --since 3600 # backfill last hour from start + +Default `DD_SITE` is `us5.datadoghq.com` — the team's Datadog instance. +Override via `DD_SITE=...` env var or `--site` flag for one-offs. + +The forwarder is a separate process by design — a Datadog outage or +auth failure must not backpressure the recorder. We exit non-zero on +fatal config errors (missing API key) and keep retrying on transient +network/HTTP errors. +""" + +from __future__ import annotations + +import argparse +import json +import os +import socket +import sys +import time +from pathlib import Path +from typing import Any, Iterator + +try: + import requests +except ImportError: + print( + "requests is required. Install it in the mcp-server venv: " + "uv pip install requests", + file=sys.stderr, + ) + sys.exit(2) + + +_DEFAULT_LOG_DIR = Path(__file__).resolve().parents[1] / ".mtlog" +_LOG_INTAKE_TPL = "https://http-intake.logs.{site}/api/v2/logs" +_METRICS_TPL = "https://api.{site}/api/v2/series" +_LOG_BATCH = 50 +_METRICS_BATCH = 100 +_MAX_RETRIES = 5 +_RETRY_BASE_S = 1.5 + + +# --- streaming JSONL with byte-position cursor ------------------------- + + +class _StreamReader: + """Reads a single rotating JSONL with cursor-based resume. + + This tails only the live `.jsonl` file. The recorder rotates files + (live `.jsonl` → `.YYYYMMDD-HHMMSS-uuuuuu-NNNNN.jsonl.gz`), which means + the live file shrinks abruptly. We detect that via inode change OR live + size < cursor position, and reset the live-file cursor to 0. + """ + + def __init__(self, path: Path, cursor: dict[str, Any]): + self.path = path + self.cursor = cursor + + def _state(self) -> tuple[int, int]: + """Return (inode, size) for the live file. 
(0, 0) if missing.""" + try: + st = self.path.stat() + return (st.st_ino, st.st_size) + except FileNotFoundError: + return (0, 0) + + def iter_new_records(self) -> Iterator[dict[str, Any]]: + ino, size = self._state() + last_ino = self.cursor.get("ino") + last_pos = int(self.cursor.get("pos") or 0) + if ino == 0: + return + if last_ino is not None and last_ino != ino: + # Rotation happened. Start over. + last_pos = 0 + if last_pos > size: + # Live file truncated/shrunk under us — recorder rotated. + last_pos = 0 + try: + with self.path.open("r", encoding="utf-8") as fh: + fh.seek(last_pos) + for line in fh: + line = line.rstrip("\n") + if not line: + continue + try: + yield json.loads(line) + except json.JSONDecodeError: + continue + last_pos = fh.tell() + except FileNotFoundError: + return + self.cursor["ino"] = ino + self.cursor["pos"] = last_pos + + +def _load_cursor(path: Path) -> dict[str, Any]: + if not path.exists(): + return {} + try: + return json.loads(path.read_text()) + except (OSError, json.JSONDecodeError): + return {} + + +def _save_cursor(path: Path, data: dict[str, Any]) -> None: + tmp = path.with_suffix(".json.tmp") + tmp.write_text(json.dumps(data, separators=(",", ":"))) + tmp.replace(path) + + +# --- Datadog clients --------------------------------------------------- + + +class _DDSession: + """Pool one HTTPS session, share retry logic.""" + + def __init__(self, api_key: str, site: str, hostname: str) -> None: + self.api_key = api_key + self.site = site + self.hostname = hostname + self.session = requests.Session() + self.session.headers.update( + { + "DD-API-KEY": api_key, + "Content-Type": "application/json", + } + ) + + def _post(self, url: str, payload: Any) -> bool: + for attempt in range(_MAX_RETRIES): + try: + resp = self.session.post(url, json=payload, timeout=30) + except requests.RequestException as e: + _wait_retry(attempt, f"network error: {e}") + continue + if 200 <= resp.status_code < 300: + return True + if resp.status_code in 
(408, 429, 500, 502, 503, 504): + _wait_retry( + attempt, + f"HTTP {resp.status_code} retrying", + ) + continue + print( + f"datadog refused: {resp.status_code} {resp.text[:200]}", + file=sys.stderr, + ) + return False + return False + + def send_logs(self, records: list[dict[str, Any]]) -> int: + if not records: + return 0 + url = _LOG_INTAKE_TPL.format(site=self.site) + sent = 0 + for i in range(0, len(records), _LOG_BATCH): + batch = records[i : i + _LOG_BATCH] + if self._post(url, batch): + sent += len(batch) + return sent + + def send_metrics(self, series: list[dict[str, Any]]) -> int: + if not series: + return 0 + url = _METRICS_TPL.format(site=self.site) + sent = 0 + for i in range(0, len(series), _METRICS_BATCH): + batch = series[i : i + _METRICS_BATCH] + if self._post(url, {"series": batch}): + sent += len(batch) + return sent + + +def _wait_retry(attempt: int, reason: str) -> None: + wait = _RETRY_BASE_S * (2**attempt) + print( + f" retry {attempt + 1}/{_MAX_RETRIES} in {wait:.1f}s ({reason})", + file=sys.stderr, + ) + time.sleep(wait) + + +# --- record → datadog payload ------------------------------------------ + + +def _log_record_to_dd(rec: dict[str, Any], host: str) -> dict[str, Any]: + line = rec.get("line") or "" + tags = [ + f"role:{rec.get('role')}", + f"port:{rec.get('port')}", + ] + level = rec.get("level") + if level: + tags.append(f"level:{level}") + tag = rec.get("tag") + if tag: + tags.append(f"thread:{tag}") + return { + "ddsource": "meshtastic-firmware", + "service": "meshtastic-firmware", + "hostname": host, + "message": line, + "ddtags": ",".join(t for t in tags if t and "None" not in t), + "timestamp": int((rec.get("ts") or time.time()) * 1000), + "level": level, + } + + +def _telemetry_record_to_metrics( + rec: dict[str, Any], host: str +) -> list[dict[str, Any]]: + fields = rec.get("fields") or {} + if not isinstance(fields, dict): + return [] + variant = rec.get("variant") or "unknown" + ts = int(rec.get("ts") or time.time()) + out: 
list[dict[str, Any]] = [] + tags = [] + if rec.get("port"): + tags.append(f"port:{rec['port']}") + if rec.get("role"): + tags.append(f"role:{rec['role']}") + if rec.get("from_node"): + tags.append(f"from_node:{rec['from_node']}") + tags.append(f"variant:{variant}") + for field, value in fields.items(): + if not isinstance(value, (int, float)) or isinstance(value, bool): + continue + metric = f"mesh.{variant}.{_metric_safe(field)}" + out.append( + { + "metric": metric, + "type": 3, # GAUGE + "points": [{"timestamp": ts, "value": float(value)}], + "tags": tags, + "resources": [{"type": "host", "name": host}], + } + ) + return out + + +def _metric_safe(name: str) -> str: + # Lowercase, replace non-alnum with underscore for safe metric names. + return "".join(c.lower() if c.isalnum() else "_" for c in name) + + +# --- main loop --------------------------------------------------------- + + +def run( + log_dir: Path, + *, + once: bool, + since_seconds: float | None, + poll_interval: float, + dd: _DDSession, +) -> int: + cursor_path = log_dir / ".dd-cursor.json" + cursors = _load_cursor(cursor_path) + + # `--since` overrides cursor: rewind to (now-since) timestamp. + # We can't seek by timestamp directly (cursor is byte position), so + # we just reset cursors to 0 and let the time filter in iter_new + # drop older records. 
+ cutoff_ts: float | None = None + if since_seconds is not None: + cursors = {} + cutoff_ts = time.time() - since_seconds + + sent_total = {"logs": 0, "telemetry": 0} + + while True: + # logs.jsonl → DD logs + log_cursor = cursors.setdefault("logs", {}) + log_batch: list[dict[str, Any]] = [] + for rec in _StreamReader(log_dir / "logs.jsonl", log_cursor).iter_new_records(): + if cutoff_ts and (rec.get("ts") or 0) < cutoff_ts: + continue + log_batch.append(_log_record_to_dd(rec, dd.hostname)) + if log_batch: + n = dd.send_logs(log_batch) + sent_total["logs"] += n + print(f"logs: sent {n}/{len(log_batch)}") + + # telemetry.jsonl → DD metrics + telem_cursor = cursors.setdefault("telemetry", {}) + metric_series: list[dict[str, Any]] = [] + for rec in _StreamReader( + log_dir / "telemetry.jsonl", telem_cursor + ).iter_new_records(): + if cutoff_ts and (rec.get("ts") or 0) < cutoff_ts: + continue + metric_series.extend(_telemetry_record_to_metrics(rec, dd.hostname)) + if metric_series: + n = dd.send_metrics(metric_series) + sent_total["telemetry"] += n + print(f"telemetry: sent {n}/{len(metric_series)} metric points") + + _save_cursor(cursor_path, cursors) + + if once: + print(f"done. logs={sent_total['logs']} metrics={sent_total['telemetry']}") + return 0 + time.sleep(poll_interval) + + +def main(argv: list[str] | None = None) -> int: + parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument( + "--log-dir", + default=str(_DEFAULT_LOG_DIR), + help="Path to .mtlog/ (default: mcp-server/.mtlog)", + ) + mode = parser.add_mutually_exclusive_group() + mode.add_argument("--once", action="store_true", help="Catch up then exit") + mode.add_argument( + "--tail", + action="store_true", + help="Daemon: poll forever (default)", + ) + parser.add_argument( + "--since", + type=float, + default=None, + help="Backfill last N seconds. 
Resets cursor.", + ) + parser.add_argument( + "--poll-interval", + type=float, + default=5.0, + help="Seconds between tail polls (default 5)", + ) + parser.add_argument( + "--site", + default=os.environ.get("DD_SITE", "us5.datadoghq.com"), + help=( + "Datadog site. Default is the team's instance (us5.datadoghq.com). " + "Override via DD_SITE env var or this flag." + ), + ) + parser.add_argument( + "--host", + default=socket.gethostname(), + help="Hostname tag (default: socket.gethostname())", + ) + args = parser.parse_args(argv) + + api_key = os.environ.get("DD_API_KEY") + if not api_key: + print("DD_API_KEY env var required.", file=sys.stderr) + return 2 + + log_dir = Path(args.log_dir) + if not log_dir.exists(): + print( + f"log dir {log_dir} does not exist — start the mcp-server first.", + file=sys.stderr, + ) + return 2 + + dd = _DDSession(api_key=api_key, site=args.site, hostname=args.host) + once = args.once and not args.tail + return run( + log_dir, + once=once, + since_seconds=args.since, + poll_interval=args.poll_interval, + dd=dd, + ) + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/mcp-server/src/meshtastic_mcp/__init__.py b/mcp-server/src/meshtastic_mcp/__init__.py new file mode 100644 index 00000000000..bd696afe01d --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/__init__.py @@ -0,0 +1,3 @@ +"""Meshtastic MCP server — device discovery, PlatformIO tooling, and device admin.""" + +__version__ = "0.1.0" diff --git a/mcp-server/src/meshtastic_mcp/__main__.py b/mcp-server/src/meshtastic_mcp/__main__.py new file mode 100644 index 00000000000..4ed67db3821 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/__main__.py @@ -0,0 +1,11 @@ +"""Entry point for `python -m meshtastic_mcp`.""" + +from meshtastic_mcp.server import app + + +def main() -> None: + app.run() + + +if __name__ == "__main__": + main() diff --git a/mcp-server/src/meshtastic_mcp/admin.py b/mcp-server/src/meshtastic_mcp/admin.py new file mode 100644 index 00000000000..33f3865dd68 
--- /dev/null +++ b/mcp-server/src/meshtastic_mcp/admin.py @@ -0,0 +1,417 @@ +"""Device administration: owner, config, channels, messaging, admin actions. + +All operations use the same `connect()` context manager so port selection, +port-busy detection, and cleanup are handled uniformly. + +Config writes use a dot-path: the first segment names a section (e.g. +`"lora"` in LocalConfig or `"mqtt"` in LocalModuleConfig), remaining segments +walk protobuf fields. Enum fields accept their string names (`"US"` for +`lora.region`) so callers don't need to know the numeric values. +""" + +from __future__ import annotations + +from typing import Any + +from google.protobuf import descriptor as pb_descriptor +from google.protobuf import json_format +from meshtastic.protobuf import localonly_pb2 + +from .connection import connect + + +class AdminError(RuntimeError): + pass + + +LOCAL_CONFIG_SECTIONS = {f.name for f in localonly_pb2.LocalConfig.DESCRIPTOR.fields} +MODULE_CONFIG_SECTIONS = { + f.name for f in localonly_pb2.LocalModuleConfig.DESCRIPTOR.fields +} + + +def _require_confirm(confirm: bool, operation: str) -> None: + if not confirm: + raise AdminError(f"{operation} is destructive and requires confirm=True.") + + +def _message_to_dict(msg: Any) -> dict[str, Any]: + # `including_default_value_fields` was renamed to + # `always_print_fields_with_no_presence` in protobuf 5.26+. Pick whichever + # kwarg the installed version accepts so we work against both. 
+ kwargs: dict[str, Any] = {"preserving_proto_field_name": True} + import inspect + + sig = inspect.signature(json_format.MessageToDict) + if "always_print_fields_with_no_presence" in sig.parameters: + kwargs["always_print_fields_with_no_presence"] = False + elif "including_default_value_fields" in sig.parameters: + kwargs["including_default_value_fields"] = False + return json_format.MessageToDict(msg, **kwargs) + + +# ---------- owner ---------------------------------------------------------- + + +def set_owner( + long_name: str, + short_name: str | None = None, + port: str | None = None, +) -> dict[str, Any]: + if short_name is not None and len(short_name) > 4: + raise AdminError("short_name must be 4 characters or fewer") + with connect(port=port) as iface: + iface.localNode.setOwner(long_name=long_name, short_name=short_name) + return { + "ok": True, + "long_name": long_name, + "short_name": short_name, + } + + +# ---------- config reads --------------------------------------------------- + + +def _section_container(node, section: str) -> tuple[Any, str]: + """Return (container_message, parent_name) for a section name. + + Parent is 'localConfig' or 'moduleConfig' so callers know where to call + writeConfig() after mutating. + """ + if section in LOCAL_CONFIG_SECTIONS: + return getattr(node.localConfig, section), "localConfig" + if section in MODULE_CONFIG_SECTIONS: + return getattr(node.moduleConfig, section), "moduleConfig" + raise AdminError( + f"Unknown config section: {section!r}. " + f"Valid sections: {sorted(LOCAL_CONFIG_SECTIONS | MODULE_CONFIG_SECTIONS)}" + ) + + +def get_config(section: str | None = None, port: str | None = None) -> dict[str, Any]: + """Read one or all config sections. + + `section` may be any name in LocalConfig (device, lora, position, power, + network, display, bluetooth, security) or LocalModuleConfig (mqtt, serial, + telemetry, ...). Omit `section` or pass `"all"` for everything. 
+ """ + with connect(port=port) as iface: + node = iface.localNode + if section in (None, "all"): + lc = _message_to_dict(node.localConfig) + mc = _message_to_dict(node.moduleConfig) + return { + "config": { + "localConfig": lc, + "moduleConfig": mc, + } + } + container, _parent = _section_container(node, section) + return {"config": {section: _message_to_dict(container)}} + + +# ---------- config writes -------------------------------------------------- + + +def _coerce_enum(field: pb_descriptor.FieldDescriptor, value: Any) -> int: + """Accept an enum value as either its int or its string name.""" + enum_type = field.enum_type + if isinstance(value, bool): + raise AdminError(f"{field.name}: expected enum {enum_type.name}, got bool") + if isinstance(value, int): + if enum_type.values_by_number.get(value) is None: + raise AdminError( + f"{field.name}: {value} is not a valid {enum_type.name} value" + ) + return value + if isinstance(value, str): + upper = value.upper() + ev = enum_type.values_by_name.get(upper) + if ev is None: + valid = sorted(enum_type.values_by_name.keys()) + raise AdminError( + f"{field.name}: {value!r} is not a valid {enum_type.name}. 
" + f"Valid: {valid}" + ) + return ev.number + raise AdminError( + f"{field.name}: expected enum {enum_type.name}, got {type(value).__name__}" + ) + + +def _coerce_scalar(field: pb_descriptor.FieldDescriptor, value: Any) -> Any: + t = field.type + FT = pb_descriptor.FieldDescriptor + if t == FT.TYPE_ENUM: + return _coerce_enum(field, value) + if t == FT.TYPE_BOOL: + if isinstance(value, bool): + return value + if isinstance(value, str): + return value.strip().lower() in ("true", "yes", "1", "on") + if isinstance(value, int): + return bool(value) + if t in ( + FT.TYPE_INT32, + FT.TYPE_INT64, + FT.TYPE_UINT32, + FT.TYPE_UINT64, + FT.TYPE_SINT32, + FT.TYPE_SINT64, + FT.TYPE_FIXED32, + FT.TYPE_FIXED64, + ): + return int(value) + if t in (FT.TYPE_FLOAT, FT.TYPE_DOUBLE): + return float(value) + if t == FT.TYPE_STRING: + return str(value) + if t == FT.TYPE_BYTES: + if isinstance(value, (bytes, bytearray)): + return bytes(value) + return str(value).encode("utf-8") + raise AdminError( + f"{field.name}: unsupported field type {t}. Use raw protobuf for this field." + ) + + +def _walk_to_field( + root_msg: Any, path_segments: list[str] +) -> tuple[Any, pb_descriptor.FieldDescriptor]: + """Walk `root_msg` by field names until the leaf; return (parent_msg, leaf_field_descriptor).""" + msg = root_msg + for i, name in enumerate(path_segments): + desc = msg.DESCRIPTOR + field = desc.fields_by_name.get(name) + if field is None: + trail = ".".join(path_segments[:i] or [""]) + valid = [f.name for f in desc.fields] + raise AdminError(f"No field {name!r} in {trail}. 
Valid: {valid}") + is_last = i == len(path_segments) - 1 + if is_last: + return msg, field + if field.type != pb_descriptor.FieldDescriptor.TYPE_MESSAGE: + raise AdminError( + f"{'.'.join(path_segments[:i+1])} is a scalar; cannot descend into it" + ) + msg = getattr(msg, name) + # path_segments was empty + raise AdminError("Empty config path") + + +def set_config(path: str, value: Any, port: str | None = None) -> dict[str, Any]: + """Set a single config field by dot-path and write it to the device. + + Examples: + set_config("lora.region", "US") + set_config("lora.modem_preset", "LONG_FAST") + set_config("device.role", "ROUTER") + set_config("mqtt.enabled", True) + set_config("mqtt.address", "mqtt.example.com") + + """ + segments = [s for s in path.split(".") if s] + if not segments: + raise AdminError("path cannot be empty") + section = segments[0] + + with connect(port=port) as iface: + node = iface.localNode + container, parent_name = _section_container(node, section) + + # Treat the section as the root; the rest of the path walks into it. + leaf_parent, field = _walk_to_field(container, segments[1:] or []) + # Use `is_repeated` (modern upb protobuf API) rather than the + # deprecated `label == LABEL_REPEATED` check — the C-extension + # FieldDescriptor in protobuf >= 5.x doesn't expose `.label` at + # all, and `is_repeated` is the supported replacement that works + # across both the pure-python and upb backends. + if field.is_repeated: + raise AdminError( + f"{path!r} is a repeated field; v1 only supports scalar sets. " + "Use the raw meshtastic CLI for now." + ) + old_raw = getattr(leaf_parent, field.name) + coerced = _coerce_scalar(field, value) + try: + setattr(leaf_parent, field.name, coerced) + except (TypeError, ValueError) as exc: + raise AdminError(f"{path}: {exc}") from exc + + node.writeConfig(section) + + # Stringify enums for the response (so the caller can see the change in + # the same vocabulary they used to set it). 
+ if field.type == pb_descriptor.FieldDescriptor.TYPE_ENUM: + try: + old_display = field.enum_type.values_by_number[old_raw].name + new_display = field.enum_type.values_by_number[coerced].name + except Exception: + old_display, new_display = old_raw, coerced + else: + old_display, new_display = old_raw, coerced + + return { + "ok": True, + "path": path, + "section": section, + "parent": parent_name, + "old_value": old_display, + "new_value": new_display, + } + + +# ---------- channels ------------------------------------------------------- + + +def get_channel_url( + include_all: bool = False, port: str | None = None +) -> dict[str, Any]: + with connect(port=port) as iface: + url = iface.localNode.getURL(includeAll=include_all) + return {"url": url} + + +def set_channel_url(url: str, port: str | None = None) -> dict[str, Any]: + with connect(port=port) as iface: + # setURL replaces the channel set from the URL's contents. It does not + # return a count; we infer by counting non-DISABLED channels after. 
+ iface.localNode.setURL(url) + channels = iface.localNode.channels or [] + active = sum(1 for c in channels if getattr(c, "role", 0) != 0) + return {"ok": True, "channels_imported": active} + + +# ---------- messaging ------------------------------------------------------ + + +def send_text( + text: str, + to: str | int | None = None, + channel_index: int = 0, + want_ack: bool = False, + port: str | None = None, +) -> dict[str, Any]: + destination = to if to is not None else "^all" + with connect(port=port) as iface: + packet = iface.sendText( + text, + destinationId=destination, + wantAck=want_ack, + channelIndex=channel_index, + ) + packet_id = getattr(packet, "id", None) + return {"ok": True, "packet_id": packet_id, "destination": destination} + + +# ---------- diagnostics ---------------------------------------------------- + + +def set_debug_log_api(enabled: bool, port: str | None = None) -> dict[str, Any]: + """Toggle `config.security.debug_log_api_enabled` on the local node. + + When enabled, firmware emits log lines as protobuf `LogRecord` messages + over the StreamAPI instead of raw text. meshtastic-python surfaces them + on pubsub topic `meshtastic.log.line`, which flows through the SAME + SerialInterface our tests already hold open — no `pio device monitor` + needed, no port-contention with admin/info calls. + + Firmware gate: `src/SerialConsole.cpp` (`usingProtobufs && + config.security.debug_log_api_enabled`). Setting persists in NVS; it + survives reboot. `factory_reset(full=False)` clears it unless it's + re-applied after reset. + + Previously-documented concurrency hazard (emitLogRecord sharing the + main packet-emission buffers) has been fixed — see `StreamAPI.h` + where the log path now owns dedicated `fromRadioScratchLog` / + `txBufLog` buffers, and `StreamAPI::emitTxBuffer` + + `StreamAPI::emitLogRecord` both serialize their `stream->write` + calls via `streamLock`. Leaving the flag on under traffic is safe. 
+ """ + with connect(port=port) as iface: + sec = iface.localNode.localConfig.security + sec.debug_log_api_enabled = bool(enabled) + iface.localNode.writeConfig("security") + return {"ok": True, "debug_log_api_enabled": bool(enabled)} + + +# ---------- admin actions -------------------------------------------------- + + +def reboot( + port: str | None = None, confirm: bool = False, seconds: int = 10 +) -> dict[str, Any]: + _require_confirm(confirm, "reboot") + with connect(port=port) as iface: + iface.localNode.reboot(secs=seconds) + return {"ok": True, "rebooting_in_s": seconds} + + +def shutdown( + port: str | None = None, confirm: bool = False, seconds: int = 10 +) -> dict[str, Any]: + _require_confirm(confirm, "shutdown") + with connect(port=port) as iface: + iface.localNode.shutdown(secs=seconds) + return {"ok": True, "shutting_down_in_s": seconds} + + +def send_input_event( + event_code: int | str, + kb_char: int = 0, + touch_x: int = 0, + touch_y: int = 0, + port: str | None = None, +) -> dict[str, Any]: + """Inject an InputBroker event (button press / key / gesture) into the UI. + + Wraps `AdminMessage.send_input_event` (handled in firmware at + src/modules/AdminModule.cpp::handleSendInputEvent). Local-only — no PKI + warmup needed since the admin message is addressed to `my_node_num`. + + `event_code` accepts an int, a case-insensitive name + (`"RIGHT"` / `"input_broker_right"`), or an `InputEventCode`. The + firmware-side enum lives in src/input/InputBroker.h and is mirrored in + `meshtastic_mcp.input_events`. 
+ """ + from meshtastic.protobuf import admin_pb2 # type: ignore[import-untyped] + + from .input_events import coerce_event_code + + code = coerce_event_code(event_code) + if not 0 <= kb_char <= 255: + raise ValueError(f"kb_char out of u8 range: {kb_char}") + if not 0 <= touch_x <= 65535: + raise ValueError(f"touch_x out of u16 range: {touch_x}") + if not 0 <= touch_y <= 65535: + raise ValueError(f"touch_y out of u16 range: {touch_y}") + + with connect(port=port) as iface: + msg = admin_pb2.AdminMessage() + msg.send_input_event.event_code = code + msg.send_input_event.kb_char = kb_char + msg.send_input_event.touch_x = touch_x + msg.send_input_event.touch_y = touch_y + iface.localNode._sendAdmin(msg) + return {"ok": True, "event_code": code, "kb_char": kb_char} + + +def factory_reset( + port: str | None = None, confirm: bool = False, full: bool = False +) -> dict[str, Any]: + """Tell the node to factory-reset its config. + + Works around a meshtastic-python 2.7.8 bug: `Node.factoryReset(full=True)` + internally does `p.factory_reset_config = True` where the field is + int32. protobuf 5.x rejects bool→int assignment as a TypeError. We build + the AdminMessage directly with int values (1=non-full, 2=full) and call + `_sendAdmin` to sidestep the SDK bug entirely. + """ + _require_confirm(confirm, "factory_reset") + from meshtastic.protobuf import admin_pb2 # type: ignore[import-untyped] + + with connect(port=port) as iface: + msg = admin_pb2.AdminMessage() + msg.factory_reset_config = 2 if full else 1 + iface.localNode._sendAdmin(msg) + return {"ok": True, "full": full} diff --git a/mcp-server/src/meshtastic_mcp/boards.py b/mcp-server/src/meshtastic_mcp/boards.py new file mode 100644 index 00000000000..df5024800a6 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/boards.py @@ -0,0 +1,159 @@ +"""Board / PlatformIO env enumeration. 
+ +Parses `pio project config --json-output` — a nested list of +`[section_name, [[key, value], ...]]` pairs — into a dict keyed by env name, +extracting the `custom_meshtastic_*` metadata the firmware variants expose. + +The parsed config is cached and invalidated when `platformio.ini`'s mtime +changes, so subsequent calls don't pay the 1–2s pio startup cost. +""" + +from __future__ import annotations + +import threading +from typing import Any + +from . import config, pio + +_CACHE_LOCK = threading.Lock() +_CACHE: dict[str, Any] = {"mtime": None, "envs": None} + + +def _parse_bool(value: Any) -> bool: + if isinstance(value, bool): + return value + if isinstance(value, str): + return value.strip().lower() in ("true", "yes", "1", "on") + return bool(value) + + +def _parse_int(value: Any) -> int | None: + try: + return int(value) + except (TypeError, ValueError): + return None + + +def _parse_tags(value: Any) -> list[str]: + if value is None: + return [] + if isinstance(value, list): + return [str(v).strip() for v in value if str(v).strip()] + return [t.strip() for t in str(value).replace(",", " ").split() if t.strip()] + + +def _env_record(env_name: str, items: list[list[Any]]) -> dict[str, Any]: + """Build a normalized dict for one env section.""" + d = dict(items) + return { + "env": env_name, + "architecture": d.get("custom_meshtastic_architecture"), + "hw_model": _parse_int(d.get("custom_meshtastic_hw_model")), + "hw_model_slug": d.get("custom_meshtastic_hw_model_slug"), + "display_name": d.get("custom_meshtastic_display_name"), + "actively_supported": _parse_bool( + d.get("custom_meshtastic_actively_supported") + ), + "support_level": _parse_int(d.get("custom_meshtastic_support_level")), + "board_level": d.get("board_level"), # "pr", "extra", or None + "tags": _parse_tags(d.get("custom_meshtastic_tags")), + "images": _parse_tags(d.get("custom_meshtastic_images")), + "board": d.get("board"), + "upload_speed": _parse_int(d.get("upload_speed")), + 
"upload_protocol": d.get("upload_protocol"), + "monitor_speed": _parse_int(d.get("monitor_speed")), + "monitor_filters": d.get("monitor_filters") or [], + "_raw": d, # Full dict for get_board + } + + +def _load_all() -> dict[str, dict[str, Any]]: + """Parse `pio project config` into `{env_name: record}`.""" + raw = pio.run_json(["project", "config"], timeout=pio.TIMEOUT_PROJECT_CONFIG) + result: dict[str, dict[str, Any]] = {} + for section_name, items in raw: + if not isinstance(section_name, str) or not section_name.startswith("env:"): + continue + env_name = section_name.split(":", 1)[1] + result[env_name] = _env_record(env_name, items) + return result + + +def _get_cached() -> dict[str, dict[str, Any]]: + root = config.firmware_root() + platformio_ini = root / "platformio.ini" + try: + mtime = platformio_ini.stat().st_mtime + except FileNotFoundError: + mtime = None + + with _CACHE_LOCK: + if _CACHE["envs"] is not None and _CACHE["mtime"] == mtime: + return _CACHE["envs"] + envs = _load_all() + _CACHE["envs"] = envs + _CACHE["mtime"] = mtime + return envs + + +def invalidate_cache() -> None: + with _CACHE_LOCK: + _CACHE["envs"] = None + _CACHE["mtime"] = None + + +def _public_record(rec: dict[str, Any]) -> dict[str, Any]: + """Strip the `_raw` field for list outputs.""" + return {k: v for k, v in rec.items() if not k.startswith("_")} + + +def list_boards( + architecture: str | None = None, + actively_supported_only: bool = False, + query: str | None = None, + board_level: str | None = None, # "release" | "pr" | "extra" +) -> list[dict[str, Any]]: + """Enumerate PlatformIO envs with Meshtastic metadata. + + Filters are cumulative (AND). `board_level="release"` means envs with no + explicit `board_level` set (the default release targets). 
+ """ + envs = _get_cached() + q = query.lower().strip() if query else None + + out = [] + for rec in envs.values(): + if architecture and rec.get("architecture") != architecture: + continue + if actively_supported_only and not rec.get("actively_supported"): + continue + if board_level is not None: + rec_level = rec.get("board_level") + if board_level == "release": + if rec_level not in (None, ""): + continue + elif rec_level != board_level: + continue + if q: + display = (rec.get("display_name") or "").lower() + env_name = rec.get("env", "").lower() + slug = (rec.get("hw_model_slug") or "").lower() + if q not in display and q not in env_name and q not in slug: + continue + out.append(_public_record(rec)) + + out.sort(key=lambda r: (r.get("architecture") or "", r.get("env"))) + return out + + +def get_board(env: str) -> dict[str, Any]: + """Full metadata for one env, including the raw pio config dict.""" + envs = _get_cached() + rec = envs.get(env) + if rec is None: + raise KeyError( + f"Unknown env: {env!r}. Use list_boards() to see available envs." + ) + public = _public_record(rec) + public["raw_config"] = rec["_raw"] + return public diff --git a/mcp-server/src/meshtastic_mcp/camera.py b/mcp-server/src/meshtastic_mcp/camera.py new file mode 100644 index 00000000000..5f1e5ede323 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/camera.py @@ -0,0 +1,286 @@ +"""Cross-platform USB-webcam capture for UI tests + the `capture_screen` tool. + +Backends: +- `opencv` — cv2.VideoCapture (AVFoundation on macOS, V4L2 on Linux). +- `ffmpeg` — subprocess shelling out to the system `ffmpeg` binary. Slower + per frame, but zero Python deps beyond stdlib. +- `null` — no-op stub returning a 1×1 black PNG. Used when no camera is + configured; keeps code paths alive without forcing every operator to + hook up hardware. + +Environment variables (read at `get_camera()` call time): +- `MESHTASTIC_UI_CAMERA_BACKEND` — one of `opencv` / `ffmpeg` / `null` / + `auto` (default). 
`auto` picks opencv if `cv2` imports, else ffmpeg if + `ffmpeg --version` resolves, else null. +- `MESHTASTIC_UI_CAMERA_DEVICE` — generic default (index or path). +- `MESHTASTIC_UI_CAMERA_DEVICE_` — per-role override, e.g. + `MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3=0` for the OLED-bearing heltec-v3. + Role suffix is uppercased before lookup. + +Dependencies land in the optional `[ui]` extra; imports are lazy so clients +without `opencv-python-headless` installed can still import this module. +""" + +from __future__ import annotations + +import io +import os +import shutil +import subprocess +import sys +import time +import warnings +from pathlib import Path +from typing import Protocol + + +class CameraError(RuntimeError): + """Raised when a camera backend fails to initialize or capture.""" + + +class CameraBackend(Protocol): + name: str + + def capture(self) -> bytes: + """Return one PNG-encoded frame.""" + ... + + def close(self) -> None: ... + + +# ---------- OpenCV backend ------------------------------------------------- + + +class OpenCVBackend: + name = "opencv" + + def __init__(self, device: int | str, warmup_frames: int = 5) -> None: + try: + import cv2 # type: ignore[import-untyped] # noqa: PLC0415 + except ImportError as exc: + raise CameraError( + "opencv backend requested but `cv2` is not installed. " + "Install the mcp-server [ui] extra: pip install -e '.[ui]'" + ) from exc + + self._cv2 = cv2 + device_arg: int | str + if isinstance(device, str) and device.isdigit(): + device_arg = int(device) + else: + device_arg = device + self._cap = cv2.VideoCapture(device_arg) + if not self._cap.isOpened(): + raise CameraError( + f"cv2.VideoCapture({device_arg!r}) failed to open. " + "On macOS check TCC Camera permission; on Linux check /dev/video* and v4l2 access." + ) + + # Drop the first few frames — auto-exposure + white-balance settle. 
+ for _ in range(warmup_frames): + self._cap.read() + # Fail fast if the camera opened but delivers no frames, rather + # than surfacing the problem on the first capture() call. + ok, frame = self._cap.read() + if not ok or frame is None: + self._cap.release() + raise CameraError(f"camera {device_arg!r} opened but returned no frames") + + def capture(self) -> bytes: + cv2 = self._cv2 + ok, frame = self._cap.read() + if not ok or frame is None: + raise CameraError("cv2.VideoCapture.read() returned no frame") + success, buf = cv2.imencode(".png", frame) + if not success: + raise CameraError("cv2.imencode('.png', ...) failed") + return bytes(buf) + + def close(self) -> None: + try: + self._cap.release() + except Exception: # noqa: BLE001 + pass + + +# ---------- ffmpeg subprocess backend -------------------------------------- + + +class FfmpegBackend: + name = "ffmpeg" + + def __init__(self, device: int | str) -> None: + if shutil.which("ffmpeg") is None: + raise CameraError("ffmpeg backend requested but `ffmpeg` is not on PATH") + + self._device = str(device) + # Platform-specific -f flag: + # macOS → avfoundation (index like "0") + # Linux → v4l2 (device like "/dev/video0" or "0") + if sys.platform == "darwin": + self._input_format = "avfoundation" + self._input_spec = self._device # bare index for avfoundation + else: + self._input_format = "v4l2" + self._input_spec = ( + self._device + if self._device.startswith("/dev/") + else f"/dev/video{self._device}" + ) + + def capture(self) -> bytes: + cmd = [ + "ffmpeg", + "-hide_banner", + "-loglevel", + "error", + "-f", + self._input_format, + "-i", + self._input_spec, + "-frames:v", + "1", + "-f", + "image2pipe", + "-vcodec", + "png", + "-", + ] + try: + out = subprocess.run( + cmd, capture_output=True, check=True, timeout=15 # noqa: S603 + ) + except subprocess.CalledProcessError as exc: + raise CameraError( + f"ffmpeg capture failed (rc={exc.returncode}): {exc.stderr.decode(errors='replace')[:200]}" + ) from exc + except 
subprocess.TimeoutExpired as exc: + raise CameraError("ffmpeg capture timed out after 15s") from exc + return out.stdout + + def close(self) -> None: + pass # stateless — each capture spawns a new process + + +# ---------- Null backend --------------------------------------------------- + + +# A tiny valid 1×1 transparent PNG so callers always get a decodable image. +_BLACK_1X1_PNG = bytes.fromhex( + "89504e470d0a1a0a0000000d49484452000000010000000108060000001f15c489" + "0000000d49444154789c6300010000000500010d0a2db40000000049454e44ae426082" +) + + +class NullBackend: + name = "null" + + def capture(self) -> bytes: + return _BLACK_1X1_PNG + + def close(self) -> None: + pass + + +# ---------- Factory -------------------------------------------------------- + + +def _resolve_device(role: str | None) -> str | None: + if role: + specific = os.environ.get(f"MESHTASTIC_UI_CAMERA_DEVICE_{role.upper()}") + if specific: + return specific + return os.environ.get("MESHTASTIC_UI_CAMERA_DEVICE") + + +def get_camera(role: str | None = None) -> CameraBackend: + """Return a CameraBackend for the given device role (e.g. `"esp32s3"`). + + Falls back to `NullBackend` if no camera is configured or the selected + backend fails to init — tests should treat captures as best-effort + evidence, not a blocker. + """ + backend = os.environ.get("MESHTASTIC_UI_CAMERA_BACKEND", "auto").lower() + device = _resolve_device(role) + + if backend in ("null", "none") or device is None: + return NullBackend() + + if backend == "auto": + # Prefer opencv if importable; fall back to ffmpeg; else null. 
+ try: + import cv2 # type: ignore[import-untyped] # noqa: F401,PLC0415 + + backend = "opencv" + except ImportError: + backend = "ffmpeg" if shutil.which("ffmpeg") else "null" + + if backend == "opencv": + try: + return OpenCVBackend(device) + except CameraError as exc: + warnings.warn( + f"camera backend {backend!r} failed to initialize for device " + f"{device!r}: {exc}; falling back to null backend", + RuntimeWarning, + stacklevel=2, + ) + return NullBackend() + if backend == "ffmpeg": + try: + return FfmpegBackend(device) + except CameraError as exc: + warnings.warn( + f"camera backend {backend!r} failed to initialize for device " + f"{device!r}: {exc}; falling back to null backend", + RuntimeWarning, + stacklevel=2, + ) + return NullBackend() + if backend == "null": + return NullBackend() + + raise CameraError(f"unknown MESHTASTIC_UI_CAMERA_BACKEND: {backend!r}") + + +def save_capture(png_bytes: bytes, path: Path) -> None: + path.parent.mkdir(parents=True, exist_ok=True) + path.write_bytes(png_bytes) + + +def capture_to_file(role: str | None, path: Path) -> dict[str, object]: + """One-shot: open camera, capture, write PNG, close. Returns metadata.""" + started = time.monotonic() + cam = get_camera(role) + try: + data = cam.capture() + finally: + cam.close() + save_capture(data, path) + return { + "backend": cam.name, + "path": str(path), + "bytes": len(data), + "elapsed_s": round(time.monotonic() - started, 3), + } + + +def _is_png(data: bytes) -> bool: + return data.startswith(b"\x89PNG\r\n\x1a\n") + + +# Exposed so callers can sanity-check a capture without a full PIL import. +__all__ = [ + "CameraBackend", + "CameraError", + "FfmpegBackend", + "NullBackend", + "OpenCVBackend", + "capture_to_file", + "get_camera", + "save_capture", +] + +# Keep `io` import used (pyflakes is picky) via a small guard used at import +# time to normalize stdin/stdout if a subclass ever needs it. 
+_ = io.BytesIO # noqa: SLF001 diff --git a/mcp-server/src/meshtastic_mcp/cli/__init__.py b/mcp-server/src/meshtastic_mcp/cli/__init__.py new file mode 100644 index 00000000000..04729b643e1 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/__init__.py @@ -0,0 +1,6 @@ +"""Command-line entry points that sit alongside the MCP server. + +Modules here are loaded on-demand by `[project.scripts]` entries in +`pyproject.toml`. They are NOT imported by `meshtastic_mcp.server` or the +admin/info tool surface — the MCP server stays pure stdio JSON-RPC. +""" diff --git a/mcp-server/src/meshtastic_mcp/cli/_flashlog.py b/mcp-server/src/meshtastic_mcp/cli/_flashlog.py new file mode 100644 index 00000000000..889183bb30e --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_flashlog.py @@ -0,0 +1,73 @@ +"""Flash progress log tailer for ``meshtastic-mcp-test-tui``. + +``pio.py`` / ``hw_tools.py`` tee subprocess output (``pio run -t upload``, +``esptool erase_flash``, ``nrfutil dfu``, etc.) to ``tests/flash.log`` +line-by-line as it arrives — controlled by the ``MESHTASTIC_MCP_FLASH_LOG`` +env var that ``run-tests.sh`` sets. The TUI tails that file so the operator +sees live flash progress in the pytest pane instead of 3 minutes of silence +during ``test_00_bake``. + +Separate from ``_fwlog.py`` because that one parses JSONL, this one +streams plain text lines. Same daemon-thread + EOF-backoff structure. +""" + +from __future__ import annotations + +import pathlib +import threading +import time +from typing import Callable + + +class FlashLogTailer(threading.Thread): + """Tail a plain-text log file, publish each stripped line via ``post``. + + ``post`` is invoked with a single ``str`` for every new line. Lines are + stripped of trailing newlines; empty lines after stripping are dropped. 
+ + The file may not exist yet when this thread starts: ``run-tests.sh`` + truncates it at session start, but if the tailer races the shell, we + poll for it for up to ``wait_s`` seconds before giving up. + """ + + def __init__( + self, + path: pathlib.Path, + post: Callable[[str], None], + stop: threading.Event, + *, + wait_s: float = 30.0, + ) -> None: + super().__init__(daemon=True, name="flashlog-tail") + self._path = path + self._post = post + self._stop_event = stop # avoid shadowing Thread._stop() (CPython <= 3.12) + self._wait_s = wait_s + + def run(self) -> None: + deadline = time.monotonic() + self._wait_s + while not self._path.is_file(): + if self._stop_event.is_set() or time.monotonic() > deadline: + return + time.sleep(0.1) + try: + fh = self._path.open("r", encoding="utf-8", errors="replace") + except OSError: + return + try: + while not self._stop_event.is_set(): + line = fh.readline() + if not line: + time.sleep(0.05) + continue + line = line.rstrip("\r\n") + if not line: + continue + try: + self._post(line) + except Exception: + # A post failure (e.g. closed app) is terminal for this + # thread but we still want to close the file handle. + return + finally: + fh.close() diff --git a/mcp-server/src/meshtastic_mcp/cli/_fwlog.py b/mcp-server/src/meshtastic_mcp/cli/_fwlog.py new file mode 100644 index 00000000000..7db20f81cc8 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_fwlog.py @@ -0,0 +1,96 @@ +"""Firmware log tail worker for ``meshtastic-mcp-test-tui``. + +Complements v1's reportlog-tail worker. ``tests/conftest.py`` owns a +session-scoped autouse fixture (``_firmware_log_stream``) that mirrors +every ``meshtastic.log.line`` pubsub event to ``tests/fwlog.jsonl`` — +one JSON object per line: + + {"ts": 1729100000.123, "port": "/dev/cu.usbmodem1101", "line": "..."} + +The TUI tails that file from a worker thread; each new line becomes a +:class:`FirmwareLogLine` message posted to the App. Same pattern as the +reportlog tail worker — truncate on launch, tolerate missing file for +30 s, back off at EOF. 
+ +Kept in its own module so the (large) ``test_tui.py`` stays focused on +the Textual App shell. +""" + +from __future__ import annotations + +import json +import pathlib +import threading +import time +from typing import Any, Callable + + +class FirmwareLogTailer(threading.Thread): + """Tail ``tests/fwlog.jsonl``, publish parsed records via ``post``. + + ``post`` is the App's ``post_message`` (or any callable that accepts a + single payload arg). We pass parsed dicts rather than constructing + Textual Message objects here — keeps this module free of the + textual dependency so it's unit-testable in a bare venv. + + Parameters + ---------- + path: + Path to ``tests/fwlog.jsonl``. The file may not exist yet at + startup — pytest only creates it once the session fixture runs. + post: + Callable invoked with a dict ``{"ts", "port", "line"}`` for every + new line parsed from the file. + stop: + An event the App sets to signal shutdown. + wait_s: + How long to poll for the file's creation before giving up. Default + 30 s; pytest collection on a cold cache can be slow. + + """ + + def __init__( + self, + path: pathlib.Path, + post: Callable[[dict[str, Any]], None], + stop: threading.Event, + *, + wait_s: float = 30.0, + ) -> None: + super().__init__(daemon=True, name="fwlog-tail") + self._path = path + self._post = post + self._stop_event = stop # avoid shadowing Thread._stop() (CPython <= 3.12) + self._wait_s = wait_s + + def run(self) -> None: + deadline = time.monotonic() + self._wait_s + while not self._path.is_file(): + if self._stop_event.is_set() or time.monotonic() > deadline: + return + time.sleep(0.1) + try: + fh = self._path.open("r", encoding="utf-8") + except OSError: + return + try: + while not self._stop_event.is_set(): + line = fh.readline() + if not line: + time.sleep(0.05) + continue + line = line.strip() + if not line: + continue + try: + record = json.loads(line) + except json.JSONDecodeError: + continue + # Defensive: require a dict that carries the `line` field we rely on. 
+ if not isinstance(record, dict): + continue + if "line" not in record: + continue + self._post(record) + finally: + fh.close() diff --git a/mcp-server/src/meshtastic_mcp/cli/_history.py b/mcp-server/src/meshtastic_mcp/cli/_history.py new file mode 100644 index 00000000000..639dcec5f55 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_history.py @@ -0,0 +1,129 @@ +"""Cross-run history for ``meshtastic-mcp-test-tui``. + +Persists one JSON object per pytest run to +``mcp-server/tests/.history/runs.jsonl``. The TUI reads the last N +entries on launch to render a duration sparkline in the header — a +quick read on whether the suite is slowing down over time. + +Schema (keep small; the file can grow for months): + + {"run": 42, "ts": 1729100000.0, "duration_s": 387.2, + "passed": 52, "failed": 0, "skipped": 23, "exit_code": 0, + "seed": "mcp-user-host"} +""" + +from __future__ import annotations + +import json +import os +import pathlib +import time +from dataclasses import asdict, dataclass +from typing import Iterable + +# Sparkline glyphs, low → high. 8 levels is the Unicode convention. +_SPARK_BLOCKS = "▁▂▃▄▅▆▇█" + + +@dataclass +class RunRecord: + run: int + ts: float + duration_s: float + passed: int + failed: int + skipped: int + exit_code: int + seed: str + + +class HistoryStore: + """Append-only JSONL store with bounded read. + + Writes are fsynced after each append (the file is tiny; fsync cost + is negligible and protects against truncation on a crash). + """ + + def __init__(self, path: pathlib.Path, *, keep_last: int = 50) -> None: + self._path = path + self._keep_last = keep_last + + def append(self, record: RunRecord) -> None: + try: + self._path.parent.mkdir(parents=True, exist_ok=True) + with self._path.open("a", encoding="utf-8") as fh: + fh.write(json.dumps(asdict(record)) + "\n") + fh.flush() + os.fsync(fh.fileno()) + except Exception: + # Non-fatal: history is cosmetic. 
+ pass + + def read_recent(self) -> list[RunRecord]: + """Return the last ``keep_last`` records in chronological order.""" + if not self._path.is_file(): + return [] + try: + lines = self._path.read_text(encoding="utf-8").splitlines() + except OSError: + return [] + out: list[RunRecord] = [] + # Parse tail-first so we don't waste work on a huge history. + for line in lines[-self._keep_last :]: + line = line.strip() + if not line: + continue + try: + raw = json.loads(line) + except json.JSONDecodeError: + continue + try: + out.append(RunRecord(**raw)) + except TypeError: + # Schema drift; skip the record rather than crash. + continue + return out + + def record_run( + self, + *, + run: int, + duration_s: float, + passed: int, + failed: int, + skipped: int, + exit_code: int, + seed: str, + ) -> RunRecord: + rec = RunRecord( + run=run, + ts=time.time(), + duration_s=float(duration_s), + passed=int(passed), + failed=int(failed), + skipped=int(skipped), + exit_code=int(exit_code), + seed=seed, + ) + self.append(rec) + return rec + + +def sparkline(values: Iterable[float], *, width: int = 20) -> str: + """Render a Unicode block-character sparkline from the last ``width`` values. + + Returns an empty string for empty input so the header handles + "no history yet" gracefully. + """ + buf = [v for v in values if v >= 0][-width:] + if not buf: + return "" + lo, hi = min(buf), max(buf) + if hi - lo < 1e-9: + return _SPARK_BLOCKS[len(_SPARK_BLOCKS) // 2] * len(buf) + n = len(_SPARK_BLOCKS) - 1 + out = [] + for v in buf: + idx = int(round((v - lo) / (hi - lo) * n)) + out.append(_SPARK_BLOCKS[max(0, min(n, idx))]) + return "".join(out) diff --git a/mcp-server/src/meshtastic_mcp/cli/_reproducer.py b/mcp-server/src/meshtastic_mcp/cli/_reproducer.py new file mode 100644 index 00000000000..420da3c76a7 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_reproducer.py @@ -0,0 +1,214 @@ +"""Reproducer bundle builder for ``meshtastic-mcp-test-tui``. 
+ +When the operator presses ``x`` on a failed test leaf, we package the +minimum viable failure context into a tarball under +``mcp-server/tests/reproducers/``: + +:: + + repro--.tar.gz + ├── README.md human-readable overview + ├── test_report.json the failing TestReport event from reportlog + ├── fwlog.jsonl firmware log filtered to the failure window + ├── devices.json per-device device_info + lora config snapshot + └── env.json seed, run #, pytest version, platform, hostname + +Separate module so the logic can be unit-tested without Textual. The +TUI glue is thin — one key binding calls :func:`build_reproducer_bundle` +with the focused test's state and shows the path in a modal. +""" + +from __future__ import annotations + +import io +import json +import pathlib +import platform +import re +import socket +import tarfile +import time +from dataclasses import dataclass +from typing import Any, Iterable + + +@dataclass +class ReproContext: + """Everything :func:`build_reproducer_bundle` needs. Shaped to map + cleanly onto the state the TUI already tracks — no extra data + collection required at export time.""" + + nodeid: str + longrepr: str + sections: list[tuple[str, str]] + start_ts: float | None + stop_ts: float | None + seed: str + run_number: int + exit_code: int | None + fwlog_path: pathlib.Path + output_dir: pathlib.Path + extra_device_rows: list[dict[str, Any]] # [{role, port, info, ...}, ...] + + +def _short_nodeid(nodeid: str) -> str: + """Collapse a pytest nodeid into a filename-safe slug (<= 60 chars).""" + # Drop the file path prefix; keep test name + parametrization. 
+ tail = nodeid.split("::", 1)[-1] if "::" in nodeid else nodeid + slug = re.sub(r"[^A-Za-z0-9_.\-]", "_", tail) + return slug[:60].strip("_.-") or "test" + + +def _filtered_fwlog( + fwlog_path: pathlib.Path, + start_ts: float | None, + stop_ts: float | None, + *, + pad_s: float = 5.0, +) -> bytes: + """Return fwlog.jsonl lines whose ``ts`` lies in [start-pad, stop+pad].""" + if not fwlog_path.is_file(): + return b"" + if start_ts is None or stop_ts is None: + # Without a time window, include the whole file — rare; happens + # when a test fails in setup before pytest emitted a start ts. + try: + return fwlog_path.read_bytes() + except OSError: + return b"" + lo, hi = start_ts - pad_s, stop_ts + pad_s + out = io.BytesIO() + try: + with fwlog_path.open("r", encoding="utf-8") as fh: + for line in fh: + stripped = line.strip() + if not stripped: + continue + try: + record = json.loads(stripped) + except json.JSONDecodeError: + continue + ts = record.get("ts") + if not isinstance(ts, (int, float)): + continue + if lo <= ts <= hi: + out.write(line.encode("utf-8")) + except OSError: + return b"" + return out.getvalue() + + +def _readme(ctx: ReproContext) -> str: + t = time.strftime("%Y-%m-%d %H:%M:%S %Z", time.localtime()) + return f"""# Reproducer bundle + +Exported by `meshtastic-mcp-test-tui` on {t}. + +## Failing test + +- **nodeid:** `{ctx.nodeid}` +- **seed:** `{ctx.seed}` +- **run #:** {ctx.run_number} +- **suite exit code (at export time):** {ctx.exit_code if ctx.exit_code is not None else "in progress"} + +## Files in this archive + +| File | Contents | +|---|---| +| `test_report.json` | The pytest-reportlog `TestReport` event for the failing test — includes `longrepr`, captured `sections` (stdout/stderr/log), `duration`, `location`, `keywords`. | +| `fwlog.jsonl` | Firmware log lines (from `meshtastic.log.line` pubsub) filtered to [start−5s, stop+5s] around the test's run window. Each line is `{{ts, port, line}}`. 
| +| `devices.json` | Per-device snapshot at export time: `device_info` + `lora` config per detected role. | +| `env.json` | Python version, platform, hostname, seed, run number. | + +## How to triage + +1. Open `test_report.json` and read `longrepr` + `sections` — most failures explain themselves there. +2. If the failure is a mesh/telemetry assertion, `fwlog.jsonl` is where the answer usually lives. Grep for `Error=`, `NAK`, `PKI_UNKNOWN_PUBKEY`, `Skip send`, `Guru Meditation`, or the uptime timestamps around the assertion event. +3. Compare `devices.json` against the expected state (e.g. `num_nodes >= 2`, `primary_channel == "McpTest"`, `region == "US"`). If fields disagree with the seed-derived USERPREFS profile, the device probably wasn't baked with this session's profile. + +## Reproducing locally + +```bash +cd mcp-server +MESHTASTIC_MCP_SEED='{ctx.seed}' .venv/bin/pytest '{ctx.nodeid}' --tb=long -v +``` +""" + + +def build_reproducer_bundle(ctx: ReproContext) -> pathlib.Path: + """Build a tarball under ``ctx.output_dir`` and return its path. + + Parent dirs are created as needed. Errors during optional sections + (devices, env) are swallowed — the bundle is still useful without + them; refusing to export because the device poller had a hiccup + would be worse than the export missing a file. + """ + ctx.output_dir.mkdir(parents=True, exist_ok=True) + ts = int(time.time()) + slug = _short_nodeid(ctx.nodeid) + archive_path = ctx.output_dir / f"repro-{ts}-{slug}.tar.gz" + + with tarfile.open(archive_path, "w:gz") as tar: + + def _add(name: str, data: bytes) -> None: + info = tarfile.TarInfo(name=name) + info.size = len(data) + info.mtime = ts + tar.addfile(info, io.BytesIO(data)) + + # README + _add("README.md", _readme(ctx).encode("utf-8")) + + # test_report.json — reconstruct from the fields the TUI stashes. 
+ test_report = { + "nodeid": ctx.nodeid, + "outcome": "failed", + "longrepr": ctx.longrepr, + "sections": [list(s) for s in ctx.sections], + "start": ctx.start_ts, + "stop": ctx.stop_ts, + } + _add( + "test_report.json", + json.dumps(test_report, indent=2, default=str).encode("utf-8"), + ) + + # fwlog.jsonl (filtered) + _add("fwlog.jsonl", _filtered_fwlog(ctx.fwlog_path, ctx.start_ts, ctx.stop_ts)) + + # devices.json + try: + devices_payload = json.dumps( + ctx.extra_device_rows or [], indent=2, default=str + ) + except Exception: + devices_payload = "[]" + _add("devices.json", devices_payload.encode("utf-8")) + + # env.json + try: + from importlib.metadata import version as _pkg_version + + pytest_version = _pkg_version("pytest") + except Exception: + pytest_version = "unknown" + env_payload = { + "seed": ctx.seed, + "run": ctx.run_number, + "exit_code": ctx.exit_code, + "export_ts": ts, + "python": platform.python_version(), + "pytest": pytest_version, + "platform": f"{platform.system()} {platform.release()} {platform.machine()}", + "hostname": socket.gethostname(), + } + _add("env.json", json.dumps(env_payload, indent=2).encode("utf-8")) + + return archive_path + + +def iter_entries(archive_path: pathlib.Path) -> Iterable[str]: + """Yield member names — used by callers that want to confirm the bundle shape.""" + with tarfile.open(archive_path, "r:gz") as tar: + for m in tar.getmembers(): + yield m.name diff --git a/mcp-server/src/meshtastic_mcp/cli/_uicap.py b/mcp-server/src/meshtastic_mcp/cli/_uicap.py new file mode 100644 index 00000000000..44845995480 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_uicap.py @@ -0,0 +1,83 @@ +"""UI-capture transcript tailer for ``meshtastic-mcp-test-tui``. + +Watches ``tests/ui_captures//`` for new transcript lines +(one per ``frame_capture()`` call from the UI tier) and posts them to +the TUI. Enabled by ``MESHTASTIC_UI_TUI_CAMERA=1``. 
+ +Design mirrors ``_flashlog.py``: +- Daemon thread, cooperative stop via ``threading.Event``. +- Tolerates the captures directory not existing yet (UI tier hasn't run). +- Per-file seek state so we only forward genuinely-new lines. +""" + +from __future__ import annotations + +import pathlib +import threading +import time +from typing import Callable + + +class UiCaptureTailer(threading.Thread): + """Recursively watch a captures root for new `transcript.md` lines. + + Invokes ``post(test_id, line)`` for each new line, where ``test_id`` + is derived from the path: the sanitized nodeid directory name. + """ + + def __init__( + self, + root: pathlib.Path, + post: Callable[[str, str], None], + stop: threading.Event, + *, + poll_interval: float = 0.5, + ) -> None: + super().__init__(daemon=True, name="uicap-tail") + self._root = root + self._post = post + # Stored as `_stop_event`, not `_stop`: assigning `_stop` would shadow + # the private `threading.Thread._stop()` method that some CPython + # versions call from `join()` / `is_alive()`. + self._stop_event = stop + self._poll_interval = poll_interval + # path → byte offset we've already read through + self._offsets: dict[pathlib.Path, int] = {} + + def run(self) -> None: + while not self._stop_event.is_set(): + try: + self._scan_once() + except Exception: + # Best-effort tailer: never bring down the TUI because a + # directory vanished mid-scan. + pass + time.sleep(self._poll_interval) + + def _scan_once(self) -> None: + if not self._root.is_dir(): + return + for transcript in self._root.rglob("transcript.md"): + test_id = transcript.parent.name + offset = self._offsets.get(transcript, 0) + try: + size = transcript.stat().st_size + except OSError: + continue + if size < offset: + # File truncated / rewritten: reset and re-emit.
+ offset = 0 + if size == offset: + continue + try: + with transcript.open("rb") as fh: + fh.seek(offset) + chunk = fh.read(size - offset).decode("utf-8", errors="replace") + except OSError: + continue + for line in chunk.splitlines(): + line = line.rstrip() + if not line or line.startswith("#"): + continue + try: + self._post(test_id, line) + except Exception: + return + self._offsets[transcript] = size diff --git a/mcp-server/src/meshtastic_mcp/cli/test_tui.py b/mcp-server/src/meshtastic_mcp/cli/test_tui.py new file mode 100644 index 00000000000..7f3a2da36e0 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/test_tui.py @@ -0,0 +1,1911 @@ +"""Textual TUI wrapping `mcp-server/run-tests.sh`. + +Launch: ``meshtastic-mcp-test-tui [pytest-args]`` + +The TUI *wraps* ``run-tests.sh``; it never replaces it. Same script, same +env-var resolution, same ``userPrefs.jsonc`` session fixture. Four data +sources drive live state: + +1. ``tests/reportlog.jsonl`` — written by ``pytest-reportlog``. Tailed in a + worker thread; each JSON line is published as a :class:`ReportLogEvent` + message. This is the authoritative source for tree population + per-test + outcome. +2. The pytest subprocess ``stdout`` + ``stderr`` streams — line-by-line, + published as :class:`PytestLine` messages and rendered verbatim in the + pytest pane. +3. ``tests/fwlog.jsonl`` — firmware log stream. Written by the + ``_firmware_log_stream`` autouse session fixture in ``conftest.py`` + (mirrors every ``meshtastic.log.line`` pubsub event), tailed by the + :class:`FirmwareLogTailer` worker, displayed in a wrap-enabled + RichLog with cycleable port filter. +4. ``devices.list_devices()`` + ``info.device_info(port)`` — polled only at + startup and again after ``RunFinished``. Device polling while pytest + holds a SerialInterface would deadlock on the exclusive port lock; the + existing ``hub_devices`` fixture is session-scoped so there is no safe + "between tests" window. 
The header reflects this with a "(stale)" + marker while the run is active. + +Key bindings (see :class:`TestTuiApp.BINDINGS`): + ``r`` re-run focused ``f`` filter tree ``d`` failure detail + ``g`` open report.html ``l`` cycle firmware-log port filter + ``x`` export reproducer bundle ``c`` tool-coverage panel + ``q`` / Ctrl-C graceful quit with SIGINT → SIGTERM → SIGKILL escalation + +Shipped today (v1 + v2 slice): test tree + tier counters with progress bars, +pytest tail, live firmware log with port filter, device strip with +"currently running" status column, failure-detail modal, reproducer bundle +export (filters fwlog by test's start/stop timestamps), tool-coverage +modal, cross-run history sparkline in the header, clean SIGINT +propagation. Still open (see the plan file): mesh topology mini-diagram +and airtime / channel-utilization gauges. +""" + +from __future__ import annotations + +import argparse +import json +import os +import pathlib +import signal +import subprocess +import sys +import threading +import time +from dataclasses import dataclass, field +from typing import Any, Iterator + +# --------------------------------------------------------------------------- +# Configuration constants +# --------------------------------------------------------------------------- + +# Tier names that map nodeids like "tests//..." to counter buckets. +# Order here == display order in the tier-counters table. Matches the order +# `pytest_collection_modifyitems` in `conftest.py` uses: +# bake → unit → mesh → telemetry → monitor → fleet → admin → provisioning +# so the counters table reads top-to-bottom in execution order. 
+# +# "bake" is the synthetic tier for `tests/test_00_bake.py` — the file sits +# at the `tests/` root rather than under a tier subdirectory, so without +# this mapping `_tier_of_nodeid` would return "other" and the bake outcomes +# would be silently dropped from both the tier table and the history +# record (which sums tier counters to compute passed/failed/skipped). +TIERS = ( + "bake", + "unit", + "mesh", + "telemetry", + "monitor", + "fleet", + "admin", + "provisioning", +) + +# Relative paths from the mcp-server root. +_REPORTLOG_RELATIVE = "tests/reportlog.jsonl" +_FWLOG_RELATIVE = "tests/fwlog.jsonl" +# pio / esptool / nrfutil / picotool tee subprocess output here when +# `MESHTASTIC_MCP_FLASH_LOG` is set (see `pio._run_capturing`). run-tests.sh +# sets that env var; the TUI also sets it for direct `_spawn_pytest` calls +# so `r`-key re-runs that skip the wrapper still get tee'd output. +_FLASHLOG_RELATIVE = "tests/flash.log" +_REPORT_HTML_RELATIVE = "tests/report.html" +_TOOL_COVERAGE_RELATIVE = "tests/tool_coverage.json" +_HISTORY_RELATIVE = "tests/.history/runs.jsonl" +_REPRODUCERS_RELATIVE = "tests/reproducers" +_RUN_TESTS_RELATIVE = "run-tests.sh" +_RUN_COUNTER_RELATIVE = "tests/.tui-runs" + +# Graceful-shutdown budgets (seconds) for the pytest subprocess when the +# user hits `q`. Matches what the existing CLI's atexit + userprefs sidecar +# self-heal expects. +_SIGINT_GRACE_S = 5.0 +_SIGTERM_GRACE_S = 5.0 + + +# --------------------------------------------------------------------------- +# Path resolution +# --------------------------------------------------------------------------- + + +def _mcp_server_root() -> pathlib.Path: + """Locate the mcp-server directory (the one containing run-tests.sh).""" + here = pathlib.Path(__file__).resolve() + # Walk up until we find pyproject.toml with a matching project name, or + # default to the three-up ancestor (src/meshtastic_mcp/cli/test_tui.py → + # .../mcp-server). 
The walk-up protects against unusual checkouts. + for parent in (here.parent, *here.parents): + if (parent / "pyproject.toml").is_file() and ( + parent / "run-tests.sh" + ).is_file(): + return parent + return here.parents[3] + + +# --------------------------------------------------------------------------- +# Data classes +# --------------------------------------------------------------------------- + + +@dataclass +class LeafReport: + """Per-test state drawn from reportlog events. + + Outcomes are pytest's "passed" | "failed" | "skipped", plus the TUI-only + "pending" (the initial value) and "running". + """ + + nodeid: str + tier: str + outcome: str = "pending" + duration_s: float = 0.0 + longrepr: str = "" + # Captured stdout / stderr / firmware-log sections from the test's + # `TestReport.sections`, shown in the failure-detail modal. + sections: list[tuple[str, str]] = field(default_factory=list) + # Wall-clock start/stop from the TestReport event. Used by the + # reproducer exporter (`x`) to filter `tests/fwlog.jsonl` down to + # just the lines around the failure window. + start_ts: float | None = None + stop_ts: float | None = None + + +@dataclass +class TierCounters: + tier: str + passed: int = 0 + failed: int = 0 + skipped: int = 0 + running: int = 0 + remaining: int = 0 + + +@dataclass +class DeviceRow: + role: str | None + port: str + vid: str + pid: str + description: str + # Populated from info.device_info when available; empty dict when we + # haven't queried (or when the poller is paused). + info: dict[str, Any] = field(default_factory=dict) + + +@dataclass +class State: + """Shared state owned by the App; written by workers under `lock`. + + UI code reads via Textual Message handlers which run on the UI thread + in the order workers called `post_message`, so reads don't need the + lock themselves.
+ """ + + lock: threading.Lock = field(default_factory=threading.Lock) + tiers: dict[str, TierCounters] = field( + default_factory=lambda: {t: TierCounters(tier=t) for t in TIERS} + ) + leaves: dict[str, LeafReport] = field(default_factory=dict) + # Ordered list of nodeids in the order they were first seen — lets us + # rebuild the tree deterministically. + nodeid_order: list[str] = field(default_factory=list) + devices: list[DeviceRow] = field(default_factory=list) + run_active: bool = False + exit_code: int | None = None + # nodeid of the currently-running test. Set on `when="setup"` + + # outcome="passed" (body about to execute); cleared on `when="call"` + # (any outcome) or on `when="setup"` + outcome="failed" (no body + # window). Drives the device-table "Status" column so the operator + # can see which test is touching a given device right now. + running_nodeid: str | None = None + # `time.monotonic()` captured when `running_nodeid` was set. Surfaced + # as live-updating elapsed-time ("RUNNING: test_bake_nrf52 (1:23)") so + # an operator staring at a ~3 min `test_00_bake` or a `mesh_formation` + # with a 60 s ceiling has concrete evidence the test isn't stuck. + running_started_at: float | None = None + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _tier_of_nodeid(nodeid: str) -> str: + """Map a pytest nodeid to its tier bucket. Unknown → 'other'. + + `tests/test_00_bake.py::...` is special-cased to the synthetic `bake` + tier — it's a top-level file (no tier subdirectory) so the generic + "second path segment" logic would miss it and route the bake outcomes + into the non-existent `other` bucket. + """ + parts = nodeid.split("/", 2) + if len(parts) >= 2 and parts[0] == "tests": + # Bake file sits at `tests/test_00_bake.py` — dedicated bucket. 
+ if parts[1].startswith("test_00_bake"): + return "bake" + candidate = parts[1] + if candidate in TIERS: + return candidate + return "other" + + +def _file_of_nodeid(nodeid: str) -> str: + """Extract the test file name (e.g. 'test_boards.py') from a nodeid.""" + left = nodeid.split("::", 1)[0] + return left.rsplit("/", 1)[-1] + + +def _testname_of_nodeid(nodeid: str) -> str: + """Extract the 'test_foo[param]' suffix from a nodeid, or the full thing.""" + if "::" in nodeid: + return nodeid.split("::", 1)[1] + return nodeid + + +def _roles_from_nodeid(nodeid: str) -> set[str]: + """Infer which device roles a parametrized test touches. + + Patterns we recognize (from the existing ``conftest.py`` parametrization + in ``pytest_generate_tests``): + + - ``test_foo[nrf52]`` → {"nrf52"} (baked_single) + - ``test_foo[nrf52->esp32s3]`` → {"nrf52", "esp32s3"} (mesh_pair) + + Unparametrized tests (no bracket) return an empty set — the caller + should fall back to "this test involves ALL detected devices" rather + than pretending it touches none. + """ + if "[" not in nodeid or not nodeid.endswith("]"): + return set() + try: + inner = nodeid.rsplit("[", 1)[1][:-1] + except Exception: + return set() + # Split on "->" for directed mesh pairs; otherwise treat as single role. + parts = [p.strip() for p in inner.split("->")] if "->" in inner else [inner.strip()] + return {p for p in parts if p} + + +def _parse_events(path: pathlib.Path) -> Iterator[dict[str, Any]]: + """Yield parsed JSON dicts from a reportlog file, skipping malformed lines. + + Used for smoke-testing the parser against a finished file; the live + worker has its own tail loop. 
+ """ + if not path.is_file(): + return + with path.open("r", encoding="utf-8") as fh: + for line in fh: + line = line.strip() + if not line: + continue + try: + yield json.loads(line) + except json.JSONDecodeError: + continue + + +def _load_run_number(counter_path: pathlib.Path) -> int: + """Bump + persist a monotonic run counter used in the TUI header.""" + try: + n = int(counter_path.read_text().strip()) + except Exception: + n = 0 + n += 1 + try: + counter_path.parent.mkdir(parents=True, exist_ok=True) + counter_path.write_text(str(n)) + except Exception: + # Non-fatal: the counter is cosmetic. + pass + return n + + +def _resolve_seed() -> str: + """Mirror the default-seed resolution from run-tests.sh. + + Operator can override via MESHTASTIC_MCP_SEED. Matches the + per-user/per-host default so repeated invocations land on the same PSK + (makes --assume-baked valid across invocations). + """ + if explicit := os.environ.get("MESHTASTIC_MCP_SEED"): + return explicit + try: + who = os.environ.get("USER") or os.environ.get("LOGNAME") or "anon" + except Exception: + who = "anon" + try: + import socket + + host = socket.gethostname().split(".", 1)[0] + except Exception: + host = "host" + return f"mcp-{who}-{host}" + + +def _format_duration(seconds: float) -> str: + if seconds < 60: + return f"{seconds:5.1f}s" + m, s = divmod(int(seconds), 60) + return f"{m:d}:{s:02d}" + + +# --------------------------------------------------------------------------- +# Textual imports (lazy — only when main() runs, so `_parse_events` can be +# imported by smoke tests without requiring textual installed in every env) +# --------------------------------------------------------------------------- + + +def _import_textual() -> Any: + """Return a namespace carrying every Textual class we use. + + Deferred import keeps `_parse_events` + `_tier_of_nodeid` importable + from tests / smoke scripts without pulling in the UI stack. 
+ """ + import textual + from textual.app import App, ComposeResult + from textual.binding import Binding + from textual.containers import Horizontal, Vertical + from textual.message import Message + from textual.screen import ModalScreen + from textual.widgets import DataTable, Footer, Input, RichLog, Static, Tree + + ns = argparse.Namespace() + ns.App = App + ns.Binding = Binding + ns.ComposeResult = ComposeResult + ns.DataTable = DataTable + ns.Footer = Footer + ns.Horizontal = Horizontal + ns.Input = Input + ns.Message = Message + ns.ModalScreen = ModalScreen + ns.RichLog = RichLog + ns.Static = Static + ns.Tree = Tree + ns.Vertical = Vertical + ns.textual = textual + return ns + + +# --------------------------------------------------------------------------- +# main() — the important scaffolding lives here so that when we bail out +# before entering the Textual event loop (missing terminal, --help, etc.) +# nothing has grabbed the screen yet. +# --------------------------------------------------------------------------- + + +def main(argv: list[str] | None = None) -> int: + """Entry point for `meshtastic-mcp-test-tui`.""" + argv = list(argv if argv is not None else sys.argv[1:]) + + parser = argparse.ArgumentParser( + prog="meshtastic-mcp-test-tui", + description=( + "Live Textual TUI wrapping mcp-server/run-tests.sh. " + "Passes any unrecognized arguments through to pytest." + ), + allow_abbrev=False, + ) + parser.add_argument( + "--no-tui", + action="store_true", + help=( + "Skip the TUI and exec run-tests.sh directly. Useful as a health " + "check that the wrapper argv+env resolution is working." 
+ ), + ) + args, pytest_args = parser.parse_known_args(argv) + + root = _mcp_server_root() + run_tests = root / _RUN_TESTS_RELATIVE + reportlog = root / _REPORTLOG_RELATIVE + fwlog = root / _FWLOG_RELATIVE + flashlog = root / _FLASHLOG_RELATIVE + counter = root / _RUN_COUNTER_RELATIVE + + if not run_tests.is_file(): + print( + f"error: could not locate {_RUN_TESTS_RELATIVE} relative to " + f"{root}. Is this the mcp-server checkout?", + file=sys.stderr, + ) + return 2 + + # Always clear stale log files before launching pytest. The TUI's tail + # workers race pytest file-creation; starting from a known-empty state + # avoids mid-line-decode confusion from the prior run. The fwlog session + # fixture also truncates on its end, and run-tests.sh truncates the + # flashlog — triple-truncate is deliberate (whichever side creates the + # file first, it starts empty). + for p in (reportlog, fwlog, flashlog): + try: + p.unlink(missing_ok=True) + except Exception: + pass + + # Compute + persist the run counter for the header (cosmetic). + run_number = _load_run_number(counter) + seed = _resolve_seed() + # Export the seed so the subprocess inherits the SAME value the TUI + # displays. run-tests.sh computes its own fallback if unset, and we'd + # end up with a header / wrapper-header mismatch if we let that happen. + os.environ.setdefault("MESHTASTIC_MCP_SEED", seed) + # Turn on subprocess-output tee'ing so `pio._run_capturing` writes each + # line of pio / esptool / nrfutil / picotool output to `tests/flash.log` + # as it arrives. The TUI tails that file and routes each line to the + # pytest pane so the operator sees live flash progress during long + # `pio run -t upload` / `esptool erase_flash` operations. run-tests.sh + # also sets this when invoked directly — `setdefault` so the wrapper's + # value wins when present. + os.environ.setdefault("MESHTASTIC_MCP_FLASH_LOG", str(flashlog)) + + # --no-tui: exec run-tests.sh directly. 
Useful for diagnosing wrapper + # env / argv handling without getting into Textual's alternate screen. + if args.no_tui: + cmd = [str(run_tests), *pytest_args] + os.execv(str(run_tests), cmd) # noqa: S606 — intentional + + # Textual UI import is deferred so `--help` and `--no-tui` do not pay + # the ~40 MB startup cost. + try: + tx = _import_textual() + except ImportError as exc: + print( + f"error: textual is not installed ({exc}). Install with: " + f"pip install -e '.[test]'", + file=sys.stderr, + ) + return 2 + + # Narrow-terminal warning (see plan §8 risk 2). Textual itself degrades, + # but a heads-up helps a first-time user. + term = os.environ.get("TERM", "") + if term in ("", "dumb", "screen") and not os.environ.get("TEXTUAL_NO_TERM_HINT"): + print( + f"[hint] TERM={term!r} may render poorly. Try " + f"`TERM=xterm-256color meshtastic-mcp-test-tui ...` if the layout " + f"looks broken.", + file=sys.stderr, + ) + + app = _build_app( + tx=tx, + root=root, + run_tests=run_tests, + reportlog=reportlog, + fwlog=fwlog, + flashlog=flashlog, + seed=seed, + run_number=run_number, + pytest_args=pytest_args, + ) + + # App.run() returns the subprocess exit code via `app.exit(returncode)`. + return_value = app.run() + if isinstance(return_value, int): + return return_value + return 0 + + +# --------------------------------------------------------------------------- +# Everything below is only reachable once Textual is importable. `tx` is +# the namespace returned by `_import_textual()` so we don't scatter `from +# textual import ...` across the file. +# --------------------------------------------------------------------------- + + +def _build_app( + *, + tx: Any, + root: pathlib.Path, + run_tests: pathlib.Path, + reportlog: pathlib.Path, + fwlog: pathlib.Path, + flashlog: pathlib.Path, + seed: str, + run_number: int, + pytest_args: list[str], +) -> Any: + """Assemble TestTuiApp with its Textual-dependent inner classes. 
+ + Keeping the class definitions inside a factory means `main()` can + short-circuit (--no-tui, terminal-check, argparse error) before we + force Textual's import cost. + """ + + # Helper modules — lazy-imported here so the top-of-file import cost + # only kicks in when main() has decided to run the TUI. + from . import _flashlog as _flashlog_mod + from . import _fwlog as _fwlog_mod + from . import _history as _history_mod + from . import _reproducer as _reproducer_mod + from . import _uicap as _uicap_mod + + # ---------------- Messages ---------------- + + class ReportLogEvent(tx.Message): + def __init__(self, event: dict[str, Any]) -> None: + self.event = event + super().__init__() + + class PytestLine(tx.Message): + def __init__(self, source: str, line: str) -> None: + self.source = source # "stdout" | "stderr" + self.line = line + super().__init__() + + class FirmwareLogLine(tx.Message): + def __init__(self, record: dict[str, Any]) -> None: + # {"ts": float, "port": str | None, "line": str} + self.record = record + super().__init__() + + class FlashLogLine(tx.Message): + """Plain-text line from `tests/flash.log` — pio / esptool / nrfutil / + picotool output tee'd by `pio._run_capturing`. Routed to the pytest + pane so the operator sees live flash progress during `test_00_bake` + instead of 3 minutes of pytest-captured silence.""" + + def __init__(self, line: str) -> None: + self.line = line + super().__init__() + + class UiCaptureLine(tx.Message): + """Live line from the UI-tier camera transcript — one per + `frame_capture()` call. 
Posted only when the camera panel is + enabled via `MESHTASTIC_UI_TUI_CAMERA=1`.""" + + def __init__(self, test_id: str, line: str) -> None: + self.test_id = test_id + self.line = line + super().__init__() + + class DeviceSnapshot(tx.Message): + def __init__(self, rows: list[DeviceRow]) -> None: + self.rows = rows + super().__init__() + + class RunFinished(tx.Message): + def __init__(self, returncode: int) -> None: + self.returncode = returncode + super().__init__() + + # ---------------- Workers ---------------- + + class ReportlogWorker(threading.Thread): + """Tail `reportlog.jsonl`, publish each event.""" + + def __init__(self, app: Any, path: pathlib.Path, stop: threading.Event) -> None: + super().__init__(daemon=True, name="reportlog-tail") + self._app = app + self._path = path + # Stored as `_stop_event`, not `_stop`: assigning `_stop` would + # shadow the private `threading.Thread._stop()` method that some + # CPython versions call from `join()` / `is_alive()`. + self._stop_event = stop + + def run(self) -> None: + # Wait up to 30 s for pytest to create the file (first call on + # a cold cache can be slow). + wait_deadline = time.monotonic() + 30.0 + while not self._path.is_file(): + if self._stop_event.is_set() or time.monotonic() > wait_deadline: + return + time.sleep(0.1) + try: + fh = self._path.open("r", encoding="utf-8") + except OSError: + return + try: + while not self._stop_event.is_set(): + line = fh.readline() + if not line: + time.sleep(0.05) + continue + line = line.strip() + if not line: + continue + try: + event = json.loads(line) + except json.JSONDecodeError: + continue + self._app.post_message(ReportLogEvent(event)) + finally: + fh.close() + + class SubprocessReaderWorker(threading.Thread): + """Read one stream line-by-line and publish PytestLine messages.""" + + def __init__( + self, + app: Any, + stream: Any, + source: str, + stop: threading.Event, + ) -> None: + super().__init__(daemon=True, name=f"subprocess-{source}") + self._app = app + self._stream = stream + self._source = source + # `_stop_event` rather than `_stop`, for the same reason as above. + self._stop_event = stop + + def run(self) -> None: + try: + for line in iter(self._stream.readline, ""): + if self._stop_event.is_set(): + break +
self._app.post_message( + PytestLine(source=self._source, line=line.rstrip("\n")) + ) + except Exception: + # stream closed / subprocess died; not fatal. + pass + + class DevicePollerWorker(threading.Thread): + """Poll list_devices() + device_info() at startup and after RunFinished. + + Deliberately NOT polling during the run: `hub_devices` is a + session-scoped fixture holding SerialInterfaces across the whole + session, and device_info() would deadlock on the exclusive port + lock. Header shows "(stale)" during the gap. + """ + + def __init__(self, app: Any, state: State, stop: threading.Event) -> None: + super().__init__(daemon=True, name="device-poller") + self._app = app + self._state = state + # Stored as `_stop_event`, not `_stop`: assigning `_stop` would + # shadow the private `threading.Thread._stop()` method that some + # CPython versions call from `join()` / `is_alive()`. + self._stop_event = stop + self._trigger = threading.Event() + + def trigger(self) -> None: + self._trigger.set() + + def run(self) -> None: + # Perform one poll at startup; then wait for explicit triggers. + self._poll_once() + while not self._stop_event.is_set(): + if self._trigger.wait(timeout=0.5): + self._trigger.clear() + if self._stop_event.is_set(): + break + with self._state.lock: + active = self._state.run_active + if active: + continue + self._poll_once() + + def _poll_once(self) -> None: + try: + from meshtastic_mcp import devices as devices_mod + from meshtastic_mcp import info as info_mod + except Exception as exc: # pragma: no cover + self._app.post_message( + PytestLine( + source="stderr", line=f"[tui] device import failed: {exc!r}" + ) + ) + return + rows: list[DeviceRow] = [] + try: + raw = devices_mod.list_devices(include_unknown=True) + except Exception as exc: + self._app.post_message( + PytestLine( + source="stderr", line=f"[tui] list_devices failed: {exc!r}" + ) + ) + return + for d in raw: + vid_raw = d.get("vid") or "" + try: + vid_i = ( + int(vid_raw, 16) + if isinstance(vid_raw, str) and vid_raw.startswith("0x") + else int(vid_raw) + ) + except (TypeError, ValueError): + vid_i = 0 + role = None + if vid_i == 0x239A: + role = "nrf52" + elif vid_i in (0x303A,
0x10C4): + role = "esp32s3" + if not role and not d.get("likely_meshtastic"): + continue + row = DeviceRow( + role=role, + port=d.get("port", ""), + vid=str(vid_raw), + pid=str(d.get("pid") or ""), + description=d.get("description", "") or "", + ) + if role: + try: + row.info = info_mod.device_info(port=row.port, timeout_s=6.0) + except Exception as exc: + row.info = {"error": repr(exc)} + rows.append(row) + self._app.post_message(DeviceSnapshot(rows=rows)) + + # ---------------- Modals ---------------- + + class FailureDetailScreen(tx.ModalScreen): + """Show a failed test's longrepr + captured sections.""" + + BINDINGS = [tx.Binding("escape,q", "dismiss", "close")] + + def __init__(self, leaf: LeafReport, report_html: pathlib.Path) -> None: + self._leaf = leaf + self._report_html = report_html + super().__init__() + + def compose(self) -> Any: + yield tx.Static( + f"[bold]{self._leaf.nodeid}[/bold] " + f"outcome=[red]{self._leaf.outcome}[/red] " + f"duration={_format_duration(self._leaf.duration_s)}", + id="failure-detail-header", + ) + log = tx.RichLog( + highlight=False, markup=False, wrap=False, id="failure-detail-log" + ) + yield log + yield tx.Static( + f"[dim]Full HTML report: {self._report_html}[/dim] [esc] close", + id="failure-detail-footer", + ) + + def on_mount(self) -> None: + log = self.query_one("#failure-detail-log", tx.RichLog) + if self._leaf.longrepr: + log.write(self._leaf.longrepr) + log.write("") + for section_name, section_text in self._leaf.sections: + log.write(f"--- {section_name} ---") + log.write(section_text) + log.write("") + if not self._leaf.longrepr and not self._leaf.sections: + log.write("(no longrepr or captured sections in reportlog event)") + + def action_dismiss(self, _result: Any = None) -> None: + self.dismiss() + + class FilterInputScreen(tx.ModalScreen[str]): + """Prompt the user for a tree filter substring (empty clears).""" + + BINDINGS = [tx.Binding("escape", "cancel", "cancel")] + + def compose(self) -> Any: + yield 
tx.Static("filter test tree (substring, empty = clear):") + yield tx.Input(placeholder="nodeid substring", id="filter-input") + + def on_input_submitted(self, event: Any) -> None: + self.dismiss(event.value.strip()) + + def action_cancel(self) -> None: + self.dismiss(None) + + class CoverageModal(tx.ModalScreen): + """Read `tests/tool_coverage.json` (written by `tests/tool_coverage.py` + at `pytest_sessionfinish`) and render a two-column summary of which + MCP tools got exercised by the run. `(no coverage data yet)` while + the run is in flight.""" + + BINDINGS = [tx.Binding("escape,q,c", "dismiss", "close")] + + def __init__(self, coverage_path: pathlib.Path) -> None: + self._path = coverage_path + super().__init__() + + def compose(self) -> Any: + yield tx.Static("[bold]MCP tool coverage[/bold]", id="coverage-header") + yield tx.RichLog( + highlight=False, markup=True, wrap=False, id="coverage-log" + ) + yield tx.Static( + f"[dim]{self._path}[/dim] [esc] close", + id="coverage-footer", + ) + + def on_mount(self) -> None: + log = self.query_one("#coverage-log", tx.RichLog) + if not self._path.is_file(): + log.write("(no coverage data — tool_coverage.json not written yet)") + log.write("") + log.write("Coverage is emitted at pytest_sessionfinish; this") + log.write("file appears after the suite completes.") + return + try: + data = json.loads(self._path.read_text(encoding="utf-8")) + except Exception as exc: + log.write(f"[red]failed to read {self._path}:[/red] {exc!r}") + return + calls = data.get("calls") or {} + if not calls: + log.write("(tool_coverage.json present but no calls recorded)") + return + exercised = sorted( + ((n, c) for n, c in calls.items() if c > 0), key=lambda x: -x[1] + ) + unexercised = sorted(n for n, c in calls.items() if c == 0) + log.write(f"[b]{len(exercised)} / {len(calls)} MCP tools exercised[/b]") + log.write("") + log.write("[green]exercised[/green] (count):") + for name, count in exercised: + log.write(f" {count:>4} {name}") + if 
unexercised: + log.write("") + log.write("[dim]not exercised:[/dim]") + for name in unexercised: + log.write(f" {name}") + + def action_dismiss(self, _result: Any = None) -> None: + self.dismiss() + + class ReproducerResultModal(tx.ModalScreen): + """Show the exported reproducer tarball path with a short instruction.""" + + BINDINGS = [tx.Binding("escape,q,enter", "dismiss", "close")] + + def __init__( + self, archive_path: pathlib.Path, error: str | None = None + ) -> None: + self._archive = archive_path + self._error = error + super().__init__() + + def compose(self) -> Any: + if self._error: + yield tx.Static(f"[red]Reproducer export failed:[/red] {self._error}") + else: + yield tx.Static("[bold green]Reproducer bundle written[/bold green]") + yield tx.Static(f"[cyan]{self._archive}[/cyan]") + yield tx.Static("") + yield tx.Static( + "Contains: README.md, test_report.json, fwlog.jsonl (time-filtered)," + ) + yield tx.Static( + "devices.json, env.json. Attach to an issue / paste the path in chat." 
+ ) + yield tx.Static("") + yield tx.Static("[dim][esc] close[/dim]") + + def action_dismiss(self, _result: Any = None) -> None: + self.dismiss() + + # ---------------- App ---------------- + + class TestTuiApp(tx.App): + CSS = """ + Screen { layout: vertical; } + #header-bar { height: 2; padding: 0 1; background: $panel; } + #tier-table { height: auto; max-height: 11; } + #body { height: 1fr; } + #tree-pane { width: 50%; border-right: solid $primary-background; } + #right-pane { width: 50%; layout: vertical; } + #pytest-pane { height: 50%; border-bottom: solid $primary-background; } + #fwlog-header { height: 1; padding: 0 1; background: $panel; } + #fwlog-pane { height: 1fr; } + #uicap-header { height: 1; padding: 0 1; background: $boost; } + #uicap-pane { height: 14; border-top: solid $primary-background; } + #uicap-image { width: 36; border-right: solid $primary-background; padding: 0 1; } + #uicap-log { width: 1fr; height: 14; } + Tree { height: 100%; } + RichLog { height: 100%; } + #device-table { height: auto; max-height: 6; } + """ + + TITLE = "mcp-server test runner" + + BINDINGS = [ + tx.Binding("r", "rerun_focused", "re-run focused"), + tx.Binding("f", "filter_tree", "filter"), + tx.Binding("d", "failure_detail", "failure detail"), + tx.Binding("g", "open_html_report", "open report.html"), + tx.Binding("x", "export_reproducer", "export reproducer"), + tx.Binding("c", "coverage_panel", "coverage"), + tx.Binding("l", "cycle_fwlog_filter", "fw log filter"), + tx.Binding("q,ctrl+c", "quit_app", "quit"), + ] + + def __init__(self) -> None: + super().__init__() + self._state = State() + self._root = root + self._run_tests = run_tests + self._reportlog = reportlog + self._fwlog = fwlog + self._flashlog = flashlog + self._report_html = root / _REPORT_HTML_RELATIVE + self._tool_coverage = root / _TOOL_COVERAGE_RELATIVE + self._repro_dir = root / _REPRODUCERS_RELATIVE + self._seed = seed + self._run_number = run_number + self._pytest_args = pytest_args + 
self._start_time = time.monotonic() + self._proc: subprocess.Popen[str] | None = None + self._stop = threading.Event() + self._reportlog_worker: ReportlogWorker | None = None + self._stdout_worker: SubprocessReaderWorker | None = None + self._stderr_worker: SubprocessReaderWorker | None = None + self._device_worker: DevicePollerWorker | None = None + self._fwlog_worker: _fwlog_mod.FirmwareLogTailer | None = None + self._flashlog_worker: _flashlog_mod.FlashLogTailer | None = None + self._uicap_worker: _uicap_mod.UiCaptureTailer | None = None + # Env-gated; only mounts the UI-capture panel when operator asks for it. + self._ui_camera_enabled = bool( + int(os.environ.get("MESHTASTIC_UI_TUI_CAMERA", "0") or "0") + ) + self._tree_filter: str = "" + self._sigint_count = 0 + # Firmware-log port filter: None = all, else exact port match. + self._fwlog_filter: str | None = None + # Ordered set of distinct ports we've seen firmware log lines + # from — the `l` key cycles through these. + self._fwlog_ports: list[str] = [] + # Cross-run history. 
+ self._history_store = _history_mod.HistoryStore( + root / _HISTORY_RELATIVE, keep_last=40 + ) + self._history_cache = self._history_store.read_recent() + + # -------- composition / mount -------- + + def compose(self) -> Any: + yield tx.Static(self._header_text(), id="header-bar") + tier_table = tx.DataTable(id="tier-table", show_cursor=False) + yield tier_table + with tx.Horizontal(id="body"): + with tx.Vertical(id="tree-pane"): + yield tx.Tree("tests", id="test-tree") + with tx.Vertical(id="right-pane"): + with tx.Vertical(id="pytest-pane"): + yield tx.RichLog( + id="pytest-log", + highlight=False, + markup=False, + wrap=False, + max_lines=5000, + ) + yield tx.Static(self._fwlog_header_text(), id="fwlog-header") + with tx.Vertical(id="fwlog-pane"): + yield tx.RichLog( + id="fwlog-log", + highlight=False, + markup=False, + # `wrap=True` so long firmware log lines (some + # hit ~200 chars — full packet hex dumps plus + # source tags) don't get truncated at the + # right edge. The right pane is ~50% of the + # terminal so even a wide terminal has a + # ~90-char cap; plain truncation dropped the + # uptime counter or packet id off the end. + wrap=True, + max_lines=5000, + ) + if self._ui_camera_enabled: + yield tx.Static( + "UI camera — latest capture + transcript (MESHTASTIC_UI_TUI_CAMERA=1)", + id="uicap-header", + ) + with tx.Horizontal(id="uicap-pane"): + yield tx.Static( + "(waiting…)", id="uicap-image", markup=False + ) + yield tx.RichLog( + id="uicap-log", + highlight=False, + markup=False, + wrap=True, + max_lines=500, + ) + yield tx.DataTable(id="device-table", show_cursor=False) + yield tx.Footer() + + def _fwlog_header_text(self) -> str: + filt = self._fwlog_filter or "(all ports)" + return f"firmware log filter: [b]{filt}[/b] [l] cycle" + + def on_mount(self) -> None: + # Tier-counters table. 
`add_column` (singular) lets us pick + # the key explicitly — `add_columns` (plural) in textual 8.x + # returns auto-generated keys that are tedious to track + # separately, and update_cell(column_key=