Skip to content

Fix spinlock race condition (rwlock)#560

Open
trim-the-main wants to merge 1 commit intohawkw:mainfrom
trim-the-main:patch-1
Open

Fix spinlock race condition (rwlock)#560
trim-the-main wants to merge 1 commit intohawkw:mainfrom
trim-the-main:patch-1

Conversation

@trim-the-main
Copy link

Readers try to acquire the lock using fetch_add(2) (least significant bit is reserved for the writers) and they check if a writer exists after they perform the addition. If there is a writer they undo their change using fetch_sub(2) . However if the writer unlocks before the fetch_sub(2) of the reader, setting the state to UNLOCKED = 0, then the subsequent fetch_sub of the reader ends up corrupting the state, overflowing and wrapping around.

With this change the unlock_exclusive just clears the writer bit from the state, leaving the reader counts unchanged. The caller must hold the lock so the state must have the writer bit set prior.

$ RUSTFLAGS="--cfg loom" cargo test blocking::rwlock
   
running 2 tests
test blocking::rwlock::tests::read_write ... ok
test blocking::rwlock::tests::write ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 43 filtered out; finished in 0.00s```

Readers try to acquire the lock using `fetch_add(2)` (least significant bit is reserved for the writers) and they check if a writer exists after they perform the addition. If there is a writer they undo their change using `fetch_sub(2)` . However if the writer unlocks before the `fetch_sub(2)` of the reader, setting the state to `UNLOCKED = 0`, then the subsequent `fetch_sub` of the reader ends up corrupting the state, overflowing and wrapping around.

With this change the `unlock_exclusive` just clears the writer bit from the state, leaving the reader counts unchanged. The caller must hold the lock so the state must have the writer bit set prior.
@trim-the-main
Copy link
Author

trim-the-main commented Feb 5, 2026

I went back to this to see how the loom tests did not catch this. I realized that LOOM_MAX_PREEMPTIONS=2 is not enough to hit this, the test blocking::rwlock::tests::read_write fails with 3 preemptions. I can suggest either bumping this limit to 3 which comes with longer test runs. Or add another test which requires one less preemption to hit this case. Something like:

// Following tests
//     blocking::rwlock::tests::write
//     blocking::rwlock::tests::read_write

#[test]
fn write_read() {
    loom::model(|| {
        let lock = Arc::new(RwLock::<usize>::new(0));
        let r_thread = {
            let reader_lock = lock.clone();
            thread::spawn(move || {
                let guard = reader_lock.read();
                assert!(*guard == 0 || *guard == 1);
            })
        };
        writer(lock)();

        r_thread.join().expect("reader thread mustn't panic")
    });
}

I just changed the order, writer is in the main thread, reader is spawn. This way it requires one less preemption to hit the bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant