core
How to reproduce
I ported the Instant ADD COLUMN feature to Percona 5.7.44, and when I ran:
./mtr --suite=innodb_fts --retry=6 --retry-failure=6 --parallel=12 --force
it crashed in bug_32831765.test.
How to fix
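Because the address of clust_rec may change once the cursor has been restored, the ut_ad(rec_offs_validate(clust_rec, clust_index, offsets)) assertion in fts_fetch_doc_from_rec can fail. Either delete that assertion or replace it with: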
ut_ad(rec_offs_validate(NULL, clust_index, offsets));
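To illustrate why passing NULL avoids the failure, here is a minimal C++ sketch of the debug check. It is a toy model, not the InnoDB implementation: toy_index_t, toy_offsets_t, toy_get_offsets and toy_offs_validate are hypothetical names. It assumes slot 2 of the offsets array caches the record address (as described in the analysis) and, as a further assumption, that slot 3 caches the index address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for InnoDB types; not the real definitions.
using rec_t = unsigned char;
struct toy_index_t { int id; };

// Toy offsets array: slot 2 caches the record address and slot 3 the
// index address (assumed layout, used only for debug validation).
using toy_offsets_t = std::vector<std::uintptr_t>;

toy_offsets_t toy_get_offsets(const rec_t* rec, const toy_index_t* index) {
    assert(rec != nullptr);
    toy_offsets_t offsets(4, 0);
    offsets[2] = reinterpret_cast<std::uintptr_t>(rec);
    offsets[3] = reinterpret_cast<std::uintptr_t>(index);
    return offsets;
}

// Returns true when the offsets are consistent with the arguments.
// A null rec skips the address comparison; the index is still checked.
bool toy_offs_validate(const rec_t* rec, const toy_index_t* index,
                       const toy_offsets_t& offsets) {
    if (rec != nullptr &&
        reinterpret_cast<std::uintptr_t>(rec) != offsets[2]) {
        return false;  // record moved since the offsets were computed
    }
    if (index != nullptr &&
        reinterpret_cast<std::uintptr_t>(index) != offsets[3]) {
        return false;  // offsets belong to a different index
    }
    return true;
}
```

In this model, validating against a record that now lives at a different address fails, while validating with a null record pointer still checks that the offsets belong to the expected index.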
Analysis
fts_add_doc_by_id
|--> mtr_start(&mtr);
|--> btr_pcur_init(&clust_pcur)
|--> btr_pcur_open_with_no_init(clust_index...clust_pcur)
|--> clust_rec = btr_pcur_get_rec(&clust_pcur);
|--> // step 1
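|--> // get the record's offsets from clust_rec on the primary key (clustered index)
|--> // offsets holds the offset of every column in the record;
|--> // in debug builds, offsets[2]=clust_rec, i.e. the record's address, used for later debug validation
|--> // this structure is passed on to fts_fetch_doc_from_rec below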
|--> offsets = rec_get_offsets(clust_rec, clust_index..)
|--> // iterate over all ids
|--> /** for (ulint i = 0; i < num_idx; ++i) */
|--> fts_fetch_doc_from_rec(..,clust_index,..clust_pcur, offsets)
| |--> // step 3
| |--> // because the cursor has been restored, clust_rec_new is not necessarily equal to the clust_rec above
| |--> clust_rec_new = btr_pcur_get_rec(pcur)
| |--> ut_ad(rec_offs_validate(clust_rec_new, clust_index, offsets));
| | |--> // the record pointer has changed, so this assertion fails and causes the crash
| | |--> ut_ad((ulint) clust_rec_new == offsets[2])
|--> btr_pcur_store_position(clust_pcur, &mtr);
|--> mtr_commit(&mtr);
|--> //step 2
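|--> // because the latches were released in between, the block that doc_pcur points to may have been split or merged
|--> // after clust_pcur is re-acquired, the address of the corresponding page_cur.rec may have changed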
|--> btr_pcur_restore_position(clust_pcur,mtr)
|--> /** end for loop */
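The store/commit/restore sequence traced above can be sketched with a small, self-contained C++ model. This is only an illustration under stated assumptions: ToyRecord, ToyTree and ToyCursor are hypothetical types that do not exist in InnoDB, and reorganize() merely stands in for a page split or merge that moves a record while no latch is held.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical toy model; none of these types exist in InnoDB.
struct ToyRecord {
    int doc_id;
    std::string text;
};

// A "tree" mapping a key to a heap-allocated record, so that a
// reorganization can move a record to a new address.
struct ToyTree {
    std::map<int, std::unique_ptr<ToyRecord>> recs;

    void insert(int id, const std::string& text) {
        recs[id] = std::make_unique<ToyRecord>(ToyRecord{id, text});
    }

    // Stand-in for a page split or merge: the new copy is allocated
    // while the old one is still alive, so its address must differ.
    void reorganize(int id) {
        auto moved = std::make_unique<ToyRecord>(*recs.at(id));
        recs[id] = std::move(moved);
    }

    const ToyRecord* lookup(int id) const { return recs.at(id).get(); }
};

// Persistent cursor: remembers the key on store, re-searches on restore.
struct ToyCursor {
    const ToyTree* tree;
    const ToyRecord* rec = nullptr;
    int stored_key = 0;

    void open(int id) { rec = tree->lookup(id); }

    void store_position() {
        assert(rec != nullptr);
        stored_key = rec->doc_id;
        rec = nullptr;  // the pointer is not trusted across the "commit"
    }

    void restore_position() { rec = tree->lookup(stored_key); }
};
```

In this model, an address cached before store_position() no longer matches the record returned after restore_position() once reorganize() has run in between, even though the record's logical content is unchanged. That is the same situation in which the cached offsets[2] stops matching the restored record.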