Skip to content

Deadlock fix in MutableBTree::insert, SPR-949#594

Merged
garthgoodson merged 5 commits into
mainfrom
spr-949-insert-deadlock
Aug 16, 2025
Merged

Deadlock fix in MutableBTree::insert, SPR-949#594
garthgoodson merged 5 commits into
mainfrom
spr-949-insert-deadlock

Conversation

@egladysh

Copy link
Copy Markdown
Collaborator

This could happen when MutableBTree::insert() makes a call to _lock_and_flush_page() while holding the shared tree lock. In some cases _lock_and_flush_page() calls _lock_and_flush_root that would try to acquire the same lock exclusively.

@linear

linear Bot commented Aug 15, 2025

Copy link
Copy Markdown
SPR-949 Deadlock in MutableBTree::insert()

It isn't easy to reproduce. The way I found it is by setting the secondary index extent size to 8K and loading a dataset with ./init_dataset.sh -d -f ./dataset/sales_index

Comment on lines +1165 to +1175
// check if the parent needs to be flushed
if (node->page->check_flush()) {
if (!node->parent) {
root_to_flush = node;
break;
}
} else {
// will lock the parent's parent and then lock and flush the parent
_lock_and_flush_page(node);
// otherwise, update the parent's size in the cache
boost::unique_lock cache_lock(_cache->mutex);
_cache_update_size(node->page, cache_lock);
break;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like in the case of a non-root branch that returns true from check_flush(), we are no longer performing a flush? That seems like a possible problem?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think before it was just returning from _lock_and_flush_page that was a recursive function.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now if check_flush() returns true, the loop continues.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. Make sense 👍

@egladysh egladysh requested a review from garthgoodson August 15, 2025 22:54
@garthgoodson garthgoodson merged commit 493f416 into main Aug 16, 2025
1 check passed
@garthgoodson garthgoodson deleted the spr-949-insert-deadlock branch August 16, 2025 03:57
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants