
Auto-recover a hung preamp on persistent I2C write failures#1098

Open
stamateviorel wants to merge 1 commit into
micro-nova:mainfrom
stamateviorel:fix/preamp-i2c-auto-recovery

Conversation

@stamateviorel

What does this change intend to accomplish?

When the preamp microcontroller hangs, it stops ACKing and every I2C write fails with OSError 121 (EREMOTEIO). The existing fallback in _Preamps.write_byte_data only reopens the SMBus handle, which recovers from a transient bus glitch but not from a hung preamp, so zone control stays dead until someone power-cycles the unit. This change escalates the fallback: when the retry on the reopened bus also fails, it resets the preamps in place, re-assigns I2C addresses, reopens the bus, re-flushes all cached register values (so zone mute/source/volume state survives the reset), and then retries the write. The reset is rate-limited to once per 20 s so that a benign one-off glitch never resets audio.
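For reviewers who want the control flow at a glance, here is a minimal, hardware-free sketch of the escalation described above. It is not the code in this PR: the class `RecoveringWriter` and the injected callables (`write`, `reopen_bus`, `reset_preamps`, `reflush_cache`) are hypothetical stand-ins, and filtering on errno 121 specifically is an assumption of the sketch.

```python
import errno
import time

# errno 121 on Linux; fall back to the literal where the name is not defined.
EREMOTEIO = getattr(errno, "EREMOTEIO", 121)


class RecoveringWriter:
    """Illustrative two-stage recovery wrapper (hypothetical, not the PR's code)."""

    def __init__(self, write, reopen_bus, reset_preamps, reflush_cache,
                 min_interval_s=20.0, clock=time.monotonic):
        self._write = write                  # raw I2C write; raises OSError on failure
        self._reopen_bus = reopen_bus        # stage 1: reopen the bus handle
        self._reset_preamps = reset_preamps  # stage 2: reset and re-assign addresses
        self._reflush_cache = reflush_cache  # stage 2: rewrite cached register values
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_reset = None              # time of the last reset attempt

    def write_byte_data(self, addr, reg, value):
        try:
            self._write(addr, reg, value)
            return
        except OSError as exc:
            if exc.errno != EREMOTEIO:
                raise  # unrelated failure: do not attempt recovery

        # Stage 1: reopen the bus and retry once.
        self._reopen_bus()
        try:
            self._write(addr, reg, value)
            return
        except OSError as exc:
            if exc.errno != EREMOTEIO:
                raise
            now = self._clock()
            recently_reset = (self._last_reset is not None
                              and now - self._last_reset < self._min_interval_s)
            if recently_reset:
                raise  # rate limit: at most one reset per interval
            self._last_reset = now

        # Stage 2: full recovery, then one final retry (errors propagate).
        self._reset_preamps()
        self._reopen_bus()
        self._reflush_cache()
        self._write(addr, reg, value)
```

In this sketch the timestamp is recorded when a reset is attempted rather than when it succeeds, so a preamp that stays wedged is reset at most once per interval instead of on every failed write.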

We hit this live on 2026-06-04 (all zone control dead, reads OK / writes failing with Errno 121, only a reboot helped). With this patch deployed the same wedge self-heals in under a second.

Checklist

  • Have you tested your changes and ensured they work? (in production on a real AmpliPi since 2026-06-04)
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • If applicable, have you updated the CHANGELOG?
  • Does your submission pass linting & tests? (python -m py_compile clean; happy to fix anything CI flags)

When the preamp microcontroller hangs it stops ACKing and every I2C
write fails with OSError 121 (EREMOTEIO). The existing fallback only
reopens the SMBus handle, which recovers a transient bus glitch but not
a hung preamp - zone control stays dead until someone power-cycles the
unit. Escalate: when the reopened-bus retry also fails, reset the
preamps in place, re-assign I2C addresses, reopen the bus and re-flush
all cached register values so zone state (mute/source/volume) survives,
then retry the write. Rate-limited to once per 20s so a benign one-off
glitch never resets audio.

Observed live on our unit 2026-06-04 (zone control dead until manual
reboot); with this patch the same wedge self-heals in under a second.

Signed-off-by: Stamate Viorel <stamate.viorel@gmail.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>