Deadlock fix in MutableBTree::insert, SPR-949 #594
SPR-949 Deadlock in MutableBTree::insert()
It isn't easy to reproduce. I found it by setting the secondary index extent size to 8K and loading a dataset with
// check if the parent needs to be flushed
if (node->page->check_flush()) {
if (!node->parent) {
root_to_flush = node;
break;
}
} else {
// will lock the parent's parent and then lock and flush the parent
_lock_and_flush_page(node);
// otherwise, update the parent's size in the cache
boost::unique_lock cache_lock(_cache->mutex);
_cache_update_size(node->page, cache_lock);
break;
It looks like a non-root branch that returns true from check_flush() is no longer flushed. Could that be a problem?
I think that before, it simply returned from _lock_and_flush_page(), which was a recursive function.
Now if check_flush() returns true, the loop continues.
Ah, got it. Makes sense 👍



The deadlock could happen when MutableBTree::insert() calls _lock_and_flush_page() while holding the shared tree lock. In some cases _lock_and_flush_page() calls _lock_and_flush_root(), which then tries to acquire the same lock exclusively, so the thread ends up waiting on a lock it already holds in shared mode.
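For readers who want the locking pattern in isolation, here is a minimal sketch of deferring the root flush until after the shared lock is released. It is not the code from this PR: `Node`, `TreeSketch`, `insert_and_flush`, and the `needs_flush` / `flushed` flags are hypothetical stand-ins, and `std::shared_mutex` (C++17) replaces whatever lock type the real tree uses. Only the idea of remembering the root in `root_to_flush` and breaking out of the loop is taken from the diff above.

```cpp
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for a tree node; the real node and page types differ.
struct Node {
    Node* parent = nullptr;
    bool needs_flush = false;  // stands in for page->check_flush()
    bool flushed = false;
};

class TreeSketch {
public:
    // Walks from `leaf` toward the root while holding the tree lock in shared
    // mode. Non-root nodes are flushed in place; the root is only remembered
    // and flushed after the shared lock has been released.
    // Returns true if the root was flushed.
    bool insert_and_flush(Node* leaf) {
        Node* root_to_flush = nullptr;
        {
            std::shared_lock<std::shared_mutex> tree_lock(_tree_mutex);
            for (Node* node = leaf; node != nullptr; node = node->parent) {
                if (!node->needs_flush) {
                    break;  // nothing more to flush further up
                }
                if (!node->parent) {
                    // Taking the exclusive lock here would self-deadlock,
                    // because this thread still holds the shared lock.
                    root_to_flush = node;
                    break;
                }
                // Assumption: real code would hold a per-page lock here.
                flush(node);
            }
        }  // shared lock released here

        if (root_to_flush) {
            std::unique_lock<std::shared_mutex> tree_lock(_tree_mutex);
            // Re-check: the state may have changed while no lock was held.
            if (root_to_flush->needs_flush) {
                flush(root_to_flush);
                return true;
            }
        }
        return false;
    }

private:
    static void flush(Node* n) {
        n->flushed = true;
        n->needs_flush = false;
    }

    std::shared_mutex _tree_mutex;
};
```

The trade-off in this sketch is the window between releasing the shared lock and taking the exclusive one, which is why the root's state is checked again under the exclusive lock. A real implementation would also have to confirm that the remembered node is still the root.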