CI run #302

Open
aversecat wants to merge 10 commits into main from auke/make_ci_green_again

@aversecat
Contributor

Various "trial-and-error" fixes accumulated by trying to diagnose and handle our unmount failures in CI.

Not for review yet.

@aversecat aversecat added the WIP label Apr 15, 2026
aversecat and others added 10 commits April 16, 2026 13:03
When block_submit_bio() fails, set BLOCK_BIT_ERROR so that
waiters in wait_event(uptodate_or_error) will wake up rather
than waiting indefinitely for a completion.

Signed-off-by: Auke Kok <auke.kok@versity.com>
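
The wake-up condition this commit relies on can be modeled in userspace. This is a minimal sketch, not the scoutfs code: `struct block_model`, `BLOCK_BIT_UPTODATE`, and `model_read_submit()` are hypothetical names, and only `BLOCK_BIT_ERROR` and the `uptodate_or_error` condition come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; only BLOCK_BIT_ERROR is named in the commit. */
enum {
    BLOCK_BIT_UPTODATE = 1 << 0,
    BLOCK_BIT_ERROR    = 1 << 1,
};

/* Hypothetical stand-in for the block structure. */
struct block_model {
    unsigned int bits;
};

/* Waiters may stop waiting once the read has finished or has failed. */
static inline bool uptodate_or_error(const struct block_model *bl)
{
    assert(bl != NULL);
    return (bl->bits & (BLOCK_BIT_UPTODATE | BLOCK_BIT_ERROR)) != 0;
}

/*
 * submit_ret simulates the return value of block_submit_bio().
 * On failure no completion will ever arrive, so the error bit is the
 * only thing that can make uptodate_or_error() become true.
 */
static inline int model_read_submit(struct block_model *bl, int submit_ret)
{
    assert(bl != NULL);
    if (submit_ret < 0) {
        bl->bits |= BLOCK_BIT_ERROR;
        return submit_ret;
    }
    return 0;
}
```
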
Replace unbounded wait_for_completion() with a 120 second timeout
to prevent indefinite hangs during unmount if the server never
responds to the farewell request.

Signed-off-by: Auke Kok <auke.kok@versity.com>
Add unmounting checks to lock_wait_cond() and lock_key_range() so
that lock waiters wake up and new lock requests fail with -ESHUTDOWN
during unmount. Replace the unbounded wait_event() with a 60 second
timeout to prevent indefinite hangs. Relax the WARN_ON_ONCE at
lock_key_range entry to only warn when not unmounting, since late
lock attempts during shutdown are expected.

Signed-off-by: Auke Kok <auke.kok@versity.com>
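
The three behaviors described in this commit reduce to small predicates. The sketch below is a userspace model under stated assumptions: `struct lock_model`, its `granted` field, and the `*_model` function names are hypothetical, and the real `lock_wait_cond()` and `lock_key_range()` take different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock; "granted" means the mode is usable. */
struct lock_model {
    bool granted;
};

/* Waiters wake when the lock is usable or when unmount has started. */
static inline bool lock_wait_cond_model(const struct lock_model *lck,
                                        bool unmounting)
{
    assert(lck != NULL);
    return unmounting || lck->granted;
}

/* New lock requests fail with -ESHUTDOWN once unmount has started. */
static inline int lock_request_check_model(bool unmounting)
{
    return unmounting ? -ESHUTDOWN : 0;
}

/*
 * The entry warning only fires outside of unmount; "bad_state" stands in
 * for whatever condition the original WARN_ON_ONCE tested.
 */
static inline bool should_warn_at_entry_model(bool bad_state, bool unmounting)
{
    return bad_state && !unmounting;
}
```
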
The "server error emptying freed" error was causing a
fence-and-reclaim test failure. In this case, the error
was -ENOLINK, which we should ignore for messaging purposes.

Signed-off-by: Chris Kirby <ckirby@versity.com>
Replace unbounded wait_for_completion() in scoutfs_net_sync_request()
with a 60 second timeout loop that checks scoutfs_unmounting(). Cancel
the queued request before returning -ESHUTDOWN so that sync_response
cannot fire on freed stack memory after the caller returns.

Signed-off-by: Auke Kok <auke.kok@versity.com>
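
The ordering that matters here, cancel first and only then return, can be shown with a single-threaded simulation. This is a sketch under assumptions, not the scoutfs implementation: `struct sync_sim` and every function below are hypothetical, each loop iteration stands in for one 60 second timeout slice, and clearing `queued` stands in for cancelling the request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulation state; one "slice" models one wait timeout. */
struct sync_sim {
    int slices_until_response; /* 0 means the response never arrives */
    int slices_until_unmount;  /* 0 means unmount never starts */
    int slices_waited;
    bool queued;               /* true while a response could still fire */
};

/* Models one bounded wait; returns true if the response arrived. */
static inline bool sim_wait_one_slice(struct sync_sim *s)
{
    s->slices_waited++;
    return s->slices_until_response != 0 &&
           s->slices_waited >= s->slices_until_response;
}

/* Models the scoutfs_unmounting() check made after each timed-out slice. */
static inline bool sim_unmounting(const struct sync_sim *s)
{
    return s->slices_until_unmount != 0 &&
           s->slices_waited >= s->slices_until_unmount;
}

static inline int sync_request_sim(struct sync_sim *s)
{
    assert(s != NULL);
    /* With neither event scheduled this loop would model the old hang. */
    assert(s->slices_until_response != 0 || s->slices_until_unmount != 0);

    s->queued = true;
    for (;;) {
        if (sim_wait_one_slice(s)) {
            s->queued = false;  /* the response consumed the request */
            return 0;
        }
        if (sim_unmounting(s)) {
            /* Cancel before returning so nothing touches this frame later. */
            s->queued = false;
            return -ESHUTDOWN;
        }
    }
}
```
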
During normal unmount, lock_invalidate_worker can hang in
scoutfs_trans_sync(sb, 1) because the trans commit path may
return network errors that cause an infinite retry loop.

Skip full lock_invalidate() during shutdown and unmount, and
extract lock_clear_coverage() to still clean up coverage items
in those paths and in scoutfs_lock_destroy().  Without this,
coverage items can remain attached to locks being freed.

Signed-off-by: Auke Kok <auke.kok@versity.com>
retry_forever() only checked scoutfs_forcing_unmount(), so a normal
unmount with a network error in the commit path would loop forever.
Also check scoutfs_unmounting() so the write worker can exit cleanly.

Signed-off-by: Auke Kok <auke.kok@versity.com>
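
The change to the exit condition can be illustrated with a small simulation. The names below (`struct retry_sim`, `stop_retrying()`, `retry_sim_run()`) are hypothetical, and returning the last error on exit is an assumption; the commit only states that the worker should be able to exit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulation of a commit attempt that keeps failing. */
struct retry_sim {
    int failures_left;  /* failing attempts before success; -1 = always fail */
    int unmount_after;  /* attempts after which unmount starts; 0 = never */
    int attempts;
};

/* Before the fix only forcing_unmount was consulted. */
static inline bool stop_retrying(bool forcing_unmount, bool unmounting)
{
    return forcing_unmount || unmounting;
}

static inline int retry_sim_run(struct retry_sim *s, bool forcing_unmount)
{
    int err;
    bool unmounting;

    assert(s != NULL);
    do {
        s->attempts++;
        /* -ENOLINK is just an example of a network error. */
        err = (s->failures_left != 0) ? -ENOLINK : 0;
        if (s->failures_left > 0)
            s->failures_left--;
        unmounting = s->unmount_after != 0 &&
                     s->attempts >= s->unmount_after;
    } while (err < 0 && !stop_retrying(forcing_unmount, unmounting));

    return err; /* assumption: the last error is handed back to the caller */
}
```
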
Add a WARN_ON_ONCE check that the freed list ref blkno matches the
block header blkno after dirtying alloc blocks.  Also save and restore
freed.first_nr on the error path, and initialize av_old/fr_old to 0
so the diagnostic message has valid values.

Signed-off-by: Auke Kok <auke.kok@versity.com>
block_dirty_ref() skipped setting *ref_blkno when the block was
already dirty, leaving the caller with a stale value.  Set it to 0
on the already-dirty fast path so callers do not try to free a
block that was not allocated.

Signed-off-by: Auke Kok <auke.kok@versity.com>
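
The stale out-parameter problem can be reproduced in a few lines. This is a userspace sketch, not the real `block_dirty_ref()`: `struct block_sim` and `dirty_ref_sim()` are hypothetical, and the slow path's choice to report the previous block number is an assumption made only so that the two paths differ visibly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a referenced block. */
struct block_sim {
    uint64_t blkno;
    bool dirty;
};

static inline int dirty_ref_sim(struct block_sim *bl, uint64_t new_blkno,
                                uint64_t *ref_blkno)
{
    assert(bl != NULL);

    if (bl->dirty) {
        /*
         * Fast path: nothing was allocated. Writing 0 here is the fix;
         * leaving *ref_blkno untouched would expose whatever value the
         * caller's variable held before the call.
         */
        if (ref_blkno)
            *ref_blkno = 0;
        return 0;
    }

    /* Slow path (assumed semantics): report a block the caller may free. */
    if (ref_blkno)
        *ref_blkno = bl->blkno;
    bl->blkno = new_blkno;
    bl->dirty = true;
    return 0;
}
```
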
Replace the unbounded wait_event() in block_read() with a 120
second timeout that issues a WARN if the bio completion never
arrives.  A lost completion would otherwise hang silently.

Signed-off-by: Auke Kok <auke.kok@versity.com>
@aversecat aversecat force-pushed the auke/make_ci_green_again branch from 9f337b7 to a46e701 April 16, 2026 20:05
2 participants