Skip to content

feat(nntp): playback streaming timeout fast-fail and retry logic#399

Open
hoivikaj wants to merge 1 commit into
nzbdav-dev:mainfrom
hoivikaj:issue-383
Open

feat(nntp): playback streaming timeout fast-fail and retry logic#399
hoivikaj wants to merge 1 commit into
nzbdav-dev:mainfrom
hoivikaj:issue-383

Conversation

@hoivikaj
Copy link
Copy Markdown
Contributor

@hoivikaj hoivikaj commented Apr 10, 2026

Reduce choppy streaming playback caused by NNTP segment timeouts

Problem

During WebDAV streaming playback, if a Usenet provider stalls mid-transfer on a segment, UsenetSharp's internal read timeout takes ~40 seconds to fire. With only 1 retry (also subject to the same ~40s timeout), a single slow segment can stall the media player for up to ~80 seconds - causing visible choppiness or complete playback interruption.

The download/queue paths can tolerate this latency, but real-time streaming cannot.

Solution

Introduce a per-attempt timeout and increased retry count specifically for WebDAV streaming requests, without affecting download/queue behavior.

When a file is streamed via WebDAV, a StreamingTimeoutContext is attached to the cancellation token (using the same context propagation pattern as the existing DownloadPriorityContext). Inside MultiConnectionNntpClient.RunWithConnection, if this context is present, each NNTP command attempt is wrapped with a tighter CancelAfter that fires well before UsenetSharp's internal timeout. If the per-attempt timeout fires but the caller's token is still alive, the stale connection is replaced and the command is retried immediately on a fresh connection.

Before: ~40s timeout × 2 attempts = ~80s worst-case stall per segment
After: ~8s timeout × 4 attempts = ~32s worst-case, with each retry getting a fresh connection

Configuration

Two new settings (with sensible defaults):

Setting Default Description
usenet.streaming-segment-timeout 8 (seconds) Per-attempt timeout for streaming segment fetches
usenet.streaming-segment-retries 3 Max retries per segment during streaming

These only apply to WebDAV streaming paths. Download queue, health checks, and other non-streaming operations are unaffected.

Changes

  • New: StreamingTimeoutContext - carries timeout config through the cancellation token context
  • Modified: MultiConnectionNntpClient.RunWithConnection - applies per-attempt timeout and streaming retry count when context is present
  • Modified: BaseStoreStreamFile - sets StreamingTimeoutContext at the WebDAV streaming entry point
  • Modified: ContextualCancellationTokenSource - propagates StreamingTimeoutContext through linked tokens
  • Modified: ConfigManager - exposes the two new config settings
  • Modified: DatabaseStoreNzbFile, DatabaseStoreRarFile, DatabaseStoreMultipartFile- pass ConfigManager to base class

Fixes #383

Signed-off-by: hoivikaj <github@hoivik.co>
@hoivikaj hoivikaj marked this pull request as ready for review April 10, 2026 00:38
@hoivikaj
Copy link
Copy Markdown
Contributor Author

@nzbdav-dev Thoughts?

loambit pushed a commit to loambit/nzbdav that referenced this pull request May 25, 2026
…+ breaker patch)"

This reverts commit 5e18568, reversing
changes made to a534dfc.
@loambit
Copy link
Copy Markdown
Contributor

loambit commented May 26, 2026

Just a heads up that that this change can lock up nzbdav/rclone until both are restarted (encountered several times while testing the branch). The real fix would need to happen in UsenetSharp, this can't be fixed on the nzbdav side alone.

What happens is when the new timeout fires and the code throws away the connection, the body read that's still running underneath hits a socket error a second later. Its cleanup tries to release a lock that just got thrown away, which crashes the cleanup before it gets to the line that gives the connection slot back to nzbdav. Every timeout leaks one slot from the pool. After enough of them during streaming, the pool is empty, new requests wait forever, and nzbdav stops sending bytes. If you have a slow provider in your list you'll hit it faster. The health endpoint stays green, the connection to rclone stays open but nothing is flowing through it, no NNTP traffic going out, and reads from the mount just hang. Restarting nzbdav alone doesn't fix it.

In UsenetSharp, UsenetClient.BodyAsync sets up an onFinally callback that runs at the end of ReadBodyToPipeAsync:

  delegate {
      pipe.Writer.Complete();
      _commandLock.Release();
      onConnectionReadyAgain?.Invoke(ArticleBodyResult.Retrieved);
  }

When this change calls connectionLock.Replace() and Dispose(), CleanupConnection disposes _commandLock. The body read then errors out, the callback runs, pipe.Writer.Complete() is fine, but _commandLock.Release() throws ObjectDisposedException and the callback never gets to onConnectionReadyAgain.Invoke(Retrieved). The body read is fire and forget from BodyAsync's caller so the error just disappears. DownloadingNntpClient._semaphore.Release() lives inside the OnConnectionReadyAgain chain that never runs, and that's where the slot goes missing. To make this work the body read would have to be cancelled cleanly so its cleanup runs while _commandLock is still alive, instead of having the connection thrown away while it's still using it. But ReadBodyToPipeAsync uses UsenetClient's own internal token, not the caller's, so from nzbdav there's no way to actually stop the body read, only ignore it, and that's why the fix has to be in UsenetSharp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0.6.2 - choppy playback with error messages in logs

2 participants