Fix loss of root/meta blkno after self-invalidation; defer invalidation processing under page locks #55

Open
Serge-sudo wants to merge 104 commits into orioledb:patches17 from Serge-sudo:fixed-invalid-meta-blocknum

@Serge-sudo
Contributor

Fix an issue where B-tree descriptor state (`metaPageBlkno` and `rootPageBlkno`) could be lost after processing an invalidation message within the same transaction.

Previously, after `o_btree_load_shmem()` initialized the descriptor, an invalidation message could be processed before the descriptor was used. This caused the descriptor to be recreated, resetting `metaPageBlkno` and `rootPageBlkno` to `InvalidBlockNumber`, which led to assertion failures (e.g. in `btree_ctid_get_and_inc`).
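
The failure mode above can be illustrated with a small standalone sketch. It is not OrioleDB code: `SketchDesc` and every `sketch_*` function are hypothetical stand-ins, and carrying the block numbers across recreation is only one assumed way to keep them, since the description does not say how the patch preserves the state. Only `BlockNumber` and `InvalidBlockNumber` mirror the PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors PostgreSQL's BlockNumber / InvalidBlockNumber definitions. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical stand-in for a B-tree descriptor; only the two fields
 * named in the description are modelled. */
typedef struct
{
    BlockNumber metaPageBlkno;
    BlockNumber rootPageBlkno;
} SketchDesc;

/* Put the descriptor into its freshly created (unloaded) state. */
static void
sketch_desc_reset(SketchDesc *d)
{
    d->metaPageBlkno = InvalidBlockNumber;
    d->rootPageBlkno = InvalidBlockNumber;
}

/* Stand-in for the shared-memory load step that fills in both fields. */
static void
sketch_load_shmem(SketchDesc *d, BlockNumber meta, BlockNumber root)
{
    assert(meta != InvalidBlockNumber && root != InvalidBlockNumber);
    d->metaPageBlkno = meta;
    d->rootPageBlkno = root;
}

/* Recreation as in the bug: everything is reset, so the loaded block
 * numbers are lost. */
static void
sketch_recreate_naive(SketchDesc *d)
{
    sketch_desc_reset(d);
}

/* Assumed remedy: remember the block numbers and restore them after
 * the descriptor has been rebuilt. */
static void
sketch_recreate_preserving(SketchDesc *d)
{
    BlockNumber meta = d->metaPageBlkno;
    BlockNumber root = d->rootPageBlkno;

    sketch_desc_reset(d);
    d->metaPageBlkno = meta;
    d->rootPageBlkno = root;
}

/* The kind of condition that later assertions depend on. */
static bool
sketch_desc_usable(const SketchDesc *d)
{
    return d->metaPageBlkno != InvalidBlockNumber &&
           d->rootPageBlkno != InvalidBlockNumber;
}
```

With the naive variant, a load followed by a self-invalidation leaves `sketch_desc_usable()` false, which corresponds to the assertion failures described above. With the preserving variant the descriptor stays usable.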

While investigating, another problem was identified: invalidation message processing may occur while holding page locks. Invalidation handlers access system trees, and if required pages are evicted, they trigger page loads. This leads to `load_page()` being called while locks are held, hitting assertions that forbid holding locks during page load.

To address both issues:

  • Ensure descriptor state is not lost due to mid-operation self-invalidation

  • Introduce deferred invalidation handling:

    • Skip invalidation message processing when it is unsafe (e.g. while holding page locks)
    • Process pending invalidation messages at the next safe point

This prevents descriptor corruption and avoids unsafe page loads under locks.
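
The deferral scheme listed above can be sketched as a small standalone model. This is an illustration under stated assumptions, not the patch itself: `InvalState` and all `sketch_*` names are hypothetical, messages are reduced to integer ids, and "safe" is taken to mean "no page locks held", which is the one example of an unsafe condition the description gives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_PENDING 16

/* Hypothetical per-backend state; none of these names exist in OrioleDB. */
typedef struct
{
    int    held_page_locks;              /* page locks currently held */
    int    pending[SKETCH_MAX_PENDING];  /* deferred message ids, in arrival order */
    size_t n_pending;
    size_t n_processed;                  /* how many messages were handled */
    int    last_processed;               /* id of the most recently handled message */
} InvalState;

static void
sketch_page_lock(InvalState *st)
{
    st->held_page_locks++;
}

static void
sketch_page_unlock(InvalState *st)
{
    assert(st->held_page_locks > 0);
    st->held_page_locks--;
}

/* Assumed safety rule: handlers may load pages, so no page lock may be held. */
static bool
sketch_inval_is_safe(const InvalState *st)
{
    return st->held_page_locks == 0;
}

/* Stand-in for a real handler; asserts the rule it relies on. */
static void
sketch_inval_handle(InvalState *st, int msg)
{
    assert(sketch_inval_is_safe(st));
    st->n_processed++;
    st->last_processed = msg;
}

/* Called at a safe point: handle everything that was deferred, in order. */
static void
sketch_inval_process_pending(InvalState *st)
{
    if (!sketch_inval_is_safe(st))
        return;                          /* not a safe point after all */
    for (size_t i = 0; i < st->n_pending; i++)
        sketch_inval_handle(st, st->pending[i]);
    st->n_pending = 0;
}

/* Entry point for an arriving message. Returns false only if the
 * deferral queue is full. */
static bool
sketch_inval_receive(InvalState *st, int msg)
{
    if (!sketch_inval_is_safe(st))
    {
        if (st->n_pending == SKETCH_MAX_PENDING)
            return false;
        st->pending[st->n_pending++] = msg;  /* defer instead of handling */
        return true;
    }
    sketch_inval_process_pending(st);        /* keep arrival order */
    sketch_inval_handle(st, msg);
    return true;
}
```

In this model a message received between `sketch_page_lock()` and `sketch_page_unlock()` is only queued, and a later call to `sketch_inval_process_pending()` with no locks held handles it. Where the real safe points are placed is not stated in the description.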

downlink -> orioledb/orioledb#792

akorotkov and others added 30 commits February 5, 2026 13:46
Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com
Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp
Reviewed-by: Andres Freund, Chris Travers
Snapshots have two pairing heap nodes: for data and system undos.
 * Added SearchCatCacheInternal_hook, SearchCatCacheList_hook
 * Added SysCacheGetAttr_hook
IsFatalError()
have_backup_in_progress()
SnapBuildNextPhaseAt()
DoLocalLockExist()
Outline-atomics is a gcc compilation flag that enables runtime detection
of CPU support for atomic instructions.
Performance on CPUs that do support atomic instructions is improved,
while compatibility and performance on CPUs without atomic instructions
is not hurt.

Discussion: https://postgr.es/m/flat/099F69EE-51D3-4214-934A-1F28C0A1A7A7%40amazon.com
Author: Tsahi Zidenberg
They are allowed to stay during shutdown checkpointing and help checkpointer
do its work.
To use curl during shared_preload_libraries initialization.
- added option --extension for pg_rewind
- extracted SimpleXLogRead from extractPageMap for generic wal iteration in pg_rewind
za-arthur and others added 25 commits February 5, 2026 13:59
pg_column_toast_chunk_id() is tied to the PG TOAST pointer format and can't work with
OrioleDB TOAST pointers. So make it safely return NULL in this case.

PG17+ only

Fixes orioledb/orioledb#690
Instead of interpreting all the extension wait events as process blockers,
only recognize "StopEvent" by its name.
Use tupleid for ROW_REF_TID as in upstream.  Use &context->tmfd.ctid for
ROW_REF_ROWID as in OrioleDB.
…o_unset_syscache_hooks call

Single-user mode may run recovery and then start a session in the same
process. When OrioleDB is loaded, recovery initializes the catalog
cache, which caused session startup to fail because it expected a
non-initialized state.

Allow the catalog cache to be already initialized when starting a
session in single-user mode after recovery.

Additionally, fix a missing o_unset_syscache_hooks call, which left
OrioleDB syscache hooks installed across the recovery/session boundary
and contributed to crashes and inconsistent backend state.

This ensures that backend-global state is properly handled when
recovery and normal session execution happen within the same process.
The latest actions/checkout@v6 has improved credential security.
Also remove some changes for pg_regress and regress tests
Add extra argument indicating commit/abort.
It's needed to count custom extension data for a database that is stored in
the PG datadir or pg_tblspc dir but outside of the <dbOid> dir.

The size counted by a hook is added to the database size counted by the existing method.
…on processing under page locks
