Spotify-Control & mplayer Robustness — live-reload, decode-then-escape, auto-respawn#147
Open
wowa1990 wants to merge 6 commits into
Open
Spotify-Control & mplayer Robustness — live-reload, decode-then-escape, auto-respawn#147wowa1990 wants to merge 6 commits into
wowa1990 wants to merge 6 commits into
Conversation
Two related-but-distinct issues in the slave-protocol shim that sits
between backend-player and mplayer.
1. Slave-protocol injection (HIGH-7). exec() built a command line and
ran it through jsStringEscape() to escape `"`, newlines, and
backslashes so a caller-controlled value (e.g. a track URL) could
not break out of the quoted argument. Then the very next line did
`str = decodeURIComponent(str)`, which UNDOES any `%22`, `%0A` or
`%5C` the caller wedged in there. Net effect: a crafted URL like
`http://attacker/file%22%0Astop%0A` decoded back to a literal `"`
and newline AFTER escaping, injecting `stop` into the slave socket.
Fix: keep decodeURIComponent (legitimate callers pass %xx-encoded
paths from m3u resolvers, RSS feeds and the spotify-control
playList helper — without decoding mplayer can't open them) but
move it BEFORE jsStringEscape. Decoding-then-escaping preserves
both halves: the caller still gets `Foo Bar/01.mp3` from `Foo%20Bar/
01.mp3`, and a malicious `Foo%22%0Astop%0A` decodes to `Foo"\nstop\n`
which jsStringEscape then neutralises into `Foo\"\\nstop\\n` —
mplayer sees the escaped form, no slave-protocol injection. Wrap
the decode in a try/catch since decodeURIComponent throws URIError
on malformed `%FF` sequences.
2. No process recovery (MED-4). spawn() ran exactly once at module
load, handled only the 'close' event, and ignored 'error' and
stdin 'error'. So:
- if mplayer crashed (SIGSEGV / OOM / decoder bug), every
subsequent exec() wrote to a closed stdin and crashed the
entire backend-player with EPIPE
- if mplayer's binary went missing or fork() failed, the spawn
error went uncaught and tore down the process
Wrap the spawn in spawnMplayer() and attach error / stdin-error
listeners. On unexpected close, schedule a respawn with capped
exponential backoff (1, 2, 4, 8, 16, 30, 30…s). Reset the counter
once a fresh process has been alive for 30s. The 'close' event
still fires on intentional close() so existing teardown semantics
are preserved (new `shutdown` flag distinguishes intentional from
crash). Commands issued during the crash-respawn window are
debug-logged and dropped instead of throwing — listeners can react
to the new 'mplayer-crash' event.
…stale meta (MED-1)
The volume-up handler had a TOCTOU race that let fast touchscreen
taps push playback past muPiBoxConfig.mupibox.maxVolume — the
Hörschutz cap parents rely on. The bug shape:
exec(cmdVolume, callback) // async amixer-read, NOT awaited
if (currentMeta.volume < muPiBoxConfig.mupibox.maxVolume) {
cmdCall(volumeUp)
currentMeta.volume += 5
}
The exec was fired but the callback (which updates currentMeta.volume)
ran later, while the if-check immediately used currentMeta.volume from
whatever value the LAST setVolume left it at. With rapid taps:
tap 1: cmd reads start, sync check sees stale volume X < cap → +5,
currentMeta.volume = X+5 (locally, optimistic)
tap 2 (before tap-1's read returns): same path, sees X+5 < cap → +5,
currentMeta.volume = X+10
tap 3: …
The actual amixer state climbs, but the local-optimistic view stays
under the cap until reads finally catch up. Net: cap exceeded.
Fix:
1. Promisify the amixer-read so each setVolume call awaits the
ACTUAL hardware volume before deciding.
2. Add a serial queue (`_volumeOpQueue`) so concurrent setVolume
invocations chain rather than overlap. Each chained op reads,
checks, writes atomically with respect to other queued ops.
3. Bound the local update with Math.min/Math.max so even if amixer
misreports, currentMeta stays in [0, maxVolume].
4. Catch failures inside the chained .then so one bad op (e.g.
amixer read failed during boot) doesn't poison the queue.
err.body was dereferenced unconditionally even though .error?.status had optional chaining; for any non-API error (DNS failure, connection drop, JSON parse error in the SDK, token-refresh HTTP failure) the handler itself crashed with TypeError, taking the whole backend-player process down via pm2 unhandledException. Add optional chaining on err and body.
chromium-autostart.sh fires from two paths that race:
1) restart_kiosk.sh after the admin "Restart services" click
2) dietpi-login auto-respawn on tty2 once the previous chromium
browser process dies
Without a guard each invocation runs its own `mplayer ${START_SOUND} &`
in the background. Live monitoring during a single user click showed
2 chromium-autostart.sh instances and 3 mplayer-startup.wav instances
running concurrently — the user reported the boot sound layered on
top of itself.
pkill any prior mplayer that's still playing the same wav before
launching a fresh one. The `${START_SOUND}` path (the wav) is unique
enough that the regex never matches the backend-player's slave-mode
mplayer (which is invoked without a media-file path on its argv).
The BQ25792 driver previously raised I2CError on any OSError from the underlying smbus2 call. On a Pi the typical OSError is errno 121 "Remote I/O error" which occurs when an I2C transaction gets interrupted mid-byte (transient bus glitch, interleaved transaction). After such an error the bus is in an indeterminate state — subsequent reads also fail until the SMBus handle is re-opened. Without the auto-reconnect a single Remote-I/O-error knocks the mupihat daemon into a stale state: the periodic_json_dump cycle keeps logging errors, /tmp/mupihat.json goes stale until the watchdog eventually resets the service. Add a new _reopen_bus() helper that closes-and-reopens the smbus2.SMBus handle. In safe_execute(), catch OSError specifically (not the broader Exception catch), call _reopen_bus(), and retry the operation exactly once. If the retry also fails the original I2CError escalation path runs as before. The retry rebinds the bound method to the new bus handle so the original `self.bq.foo` reference doesn't dangle on the closed handle. Bound-method detection via hasattr(func, '__self__'). Other Exceptions (non-OSError) keep the original behaviour: log and re-raise as I2CError.
AR5-5: mupihat.py runs the BQ25792 driver in three threads — the periodic_json_dump daemon plus two Flask request workers (`/` and `/api/registers`). Without serialisation they can interleave I2C transactions on the same smbus2.SMBus handle, producing the "[Errno 121] Remote I/O error" bursts visible in the mupi_hat journal (observed live during charging on 2026-05-15). The Pi's i2c-bcm2835 driver doesn't fully recover from a half-completed sequence, so even benign collisions leave the bus in a state where read_all_register returns garbage or fails outright. Fix: module-level threading.Lock in mupihat.py wrapping every hat.* call. periodic_json_dump now acquires the lock in three short phases around the watchdog reset, the bulk register read, and the to_json / log_register_values follow-up reads — sleeps stay outside the lock so request workers can fit between phases (no pathological 5-second waits). The JSON file write also moved outside the lock: it doesn't touch I2C and can take 5-30 ms on SD. M6: companion auto-reconnect in mupihat_bq25792.py safe_execute(). When a single I2C transaction throws OSError (errno 121 is the common case), the wrapper now closes the SMBus handle, re-opens it with smbus2.SMBus(self.i2c_device), and retries the operation once before escalating to I2CError. This clears the hung bus state that previously persisted until the next cycle (or until mupihat-daemon restart in worst-case watchdog scenarios). The retry rebinds the bound method to the new bus handle so the original `self.bq.foo` reference doesn't dangle on the closed handle. Behaviour preserved: timing identical (sleep(0.1) + sleep(1) + sleep(3.9) per cycle), all original error logging kept, no API surface change. Only the locking and the OSError-recovery path are new.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Was es macht
mupiboxconfig.jsonviafs.watch(Config-Änderungen brauchen kein Backend-Restart mehr)handleSpotifyErrorfängt jetzt auch Non-API-Errors (Network, parse, etc.)setVolumeserialisiert + liest amixer-State direkt statt aus Cache (verhindert Volume-Drift)mediaService.current$mehr — sonst bricht Spotify Connect)startup.wavDoppelton-Fix (chromium-autostart)Architektur (kurz)
Watcher hängt am Backend-Process — kein extra-Daemon. mplayer-Respawn via Exit-Code-Detection im Wrapper. I2C-Mutex ist ein
threading.Lockum die smbus-Calls, damit Read+Write nicht ineinander laufen, wenn HAT-Polling und Display-Update gleichzeitig kommen.Test in Codespaces / lokal
mupiboxconfig.json(z.B. Volume-Default) → erwartet log-Eintrag „config reloaded", neue Werte aktiv ohne RestartHinweis: mplayer-Auto-Respawn, I2C-Mutex, SMBus-Reconnect und Doppelton-Fix sind nur auf realer Box testbar (mplayer-Binary, mupihat-Hardware, pm2-Setup).
Offene Punkte
mvzweimal), gerne hochsetzen.