Cache LOB values in ReselectColumnsPostProcessor #2002

# Feature request or enhancement

## Which use case/requirement will be addressed by the proposed feature?

Add an **optional, pluggable cache** to the `ReselectColumnsPostProcessor` so that re-selected column
values can be reused across change events instead of issuing a database query for every event.

### Background

The reselect post-processor re-queries column values when an event carries the *unavailable value
placeholder* (e.g. unchanged PostgreSQL TOAST columns, Oracle LOBs) or a `null` value that the user
has configured to be re-selected. Today **every** qualifying event triggers a `SELECT` against the
source database.

For workloads with large out-of-line columns that change rarely but sit on frequently-updated rows,
this produces a high volume of repeated, identical re-selection queries for the same `(table, key,
column)` tuple. On managed databases (e.g. Aurora endpoints) this adds measurable load and latency to
the streaming pipeline, even though the re-selected value has not changed between events.

### Proposed behaviour

- Cache exposed behind a **strategy interface** (`ReselectColumnCache`) so alternative backends (e.g.
  an embedded key/value store such as RocksDB, or a user-custom implementation) can be plugged in
  without changing the post-processor. A default in-memory implementation is provided and selected via
  `reselect.cache.type`.
- The cache is organised **hierarchically** as `table -> row -> column`. The post-processor resolves a
  row-scoped handle once per event (`forRow(tableId, keyValues)`), so the table and row-key identity
  are serialized a single time and reused for every column of that row — important for tables with
  multiple LOB/TOAST columns.
- On a qualifying event, look up each required column in the cache; only the **misses** are queried,
  in a single `SELECT`. Hits are served from the cache.
- **Invalidate-on-modify**: whenever an event carries a *real* (non-placeholder) value for a column,
  its cache entry is evicted. This guarantees a later placeholder event for the same row never serves
  a value that has since changed — so cache correctness does **not** depend on the TTL.
- The default in-memory cache is bounded by size (LRU on the number of cached rows, via the existing
  `io.debezium.util.BoundedConcurrentHashMap`) and by a per-value TTL that only bounds memory retention
  for cold rows.
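
The behaviour listed above could be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: only `ReselectColumnCache`, `RowCache`, and `forRow(tableId, keyValues)` come from this proposal, while the `get`/`put`/`invalidate` methods, `MapBackedCache`, and `resolveFromCache` are hypothetical names chosen for the example. The map-backed cache is unbounded and has no TTL, and the "query" is simulated with placeholder strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class ReselectCacheSketch {

    /** Row-scoped handle (method names are assumptions, not part of the proposal). */
    interface RowCache {
        Optional<Object> get(String column);
        void put(String column, Object rawValue);
        void invalidate(String column);
    }

    /** Strategy interface; only forRow(tableId, keyValues) is named in the proposal. */
    interface ReselectColumnCache {
        RowCache forRow(String tableId, List<Object> keyValues);
    }

    /** Hypothetical stand-in for the default cache: plain maps, no LRU bound, no TTL. */
    static class MapBackedCache implements ReselectColumnCache {
        private final Map<String, Map<String, Object>> rows = new HashMap<>();

        @Override
        public RowCache forRow(String tableId, List<Object> keyValues) {
            // The row identity is rendered once here and shared by every column lookup.
            String rowKey = tableId + "|" + keyValues;
            Map<String, Object> columns = rows.computeIfAbsent(rowKey, k -> new HashMap<>());
            return new RowCache() {
                @Override public Optional<Object> get(String column) {
                    // A cached null is indistinguishable from a miss in this sketch.
                    return Optional.ofNullable(columns.get(column));
                }
                @Override public void put(String column, Object rawValue) { columns.put(column, rawValue); }
                @Override public void invalidate(String column) { columns.remove(column); }
            };
        }
    }

    /** Copies cache hits into 'resolved' and returns the columns that still need a SELECT. */
    static List<String> resolveFromCache(RowCache row, List<String> required, Map<String, Object> resolved) {
        List<String> misses = new ArrayList<>();
        for (String column : required) {
            Optional<Object> hit = row.get(column);
            if (hit.isPresent()) {
                resolved.put(column, hit.get());
            } else {
                misses.add(column);
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        ReselectColumnCache cache = new MapBackedCache();
        List<Object> key = List.of(42);
        List<String> required = List.of("body", "attachment");
        Map<String, Object> resolved = new HashMap<>();

        // First placeholder event: everything misses, so one SELECT would fetch both columns.
        RowCache row = cache.forRow("inventory.docs", key);
        List<String> misses = resolveFromCache(row, required, resolved);
        System.out.println("first event misses: " + misses);
        for (String column : misses) {
            row.put(column, "value-of-" + column); // simulated query result
        }

        // Second placeholder event for the same row: served entirely from the cache.
        resolved.clear();
        misses = resolveFromCache(cache.forRow("inventory.docs", key), required, resolved);
        System.out.println("second event misses: " + misses);

        // An event carrying a real value for 'body' evicts that entry (invalidate-on-modify).
        cache.forRow("inventory.docs", key).invalidate("body");
        resolved.clear();
        misses = resolveFromCache(cache.forRow("inventory.docs", key), required, resolved);
        System.out.println("after modify misses: " + misses);
    }
}
```

Returning only the misses lets the caller build a single `SELECT` for the remaining columns, which is the point of the row-scoped handle.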

### Configuration (all optional, cache disabled by default)

| Property | Default | Description |
|---|---|---|
| `reselect.cache.enabled` | `false` | Enable the re-selection cache. |
| `reselect.cache.type` | `io.debezium.processors.reselect.cache.MemoryReselectColumnCache` | The `ReselectColumnCache` implementation class. |
| `reselect.cache.max.size` | `10000` | Maximum number of cached rows (LRU eviction). Default in-memory cache only. |
| `reselect.cache.ttl.ms` | `600000` | Entry time-to-live in milliseconds. Default in-memory cache only. |

When `reselect.cache.enabled=false` (the default) the post-processor behaves exactly as before — no
behavioural change for existing users.
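
As an illustration, a connector configuration that opts in might look like the fragment below. The `reselector` name is an arbitrary post-processor alias, and the fragment assumes the proposed cache properties would take the same per-processor prefix as the existing `reselect.*` options. The cache properties themselves are only proposed here and are not available in any released version; the non-default size and TTL values are made up for the example.

```properties
# Existing post-processor registration
post.processors=reselector
reselector.type=io.debezium.processors.reselect.ReselectColumnsPostProcessor
reselector.reselect.unavailable.values=true
reselector.reselect.null.values=false

# Proposed cache properties from the table above (hypothetical until implemented)
reselector.reselect.cache.enabled=true
reselector.reselect.cache.max.size=50000
reselector.reselect.cache.ttl.ms=300000
```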

### Why this is safe

Re-selection normally exists to fetch the **latest** committed state, so naive caching would risk
emitting stale data. The invalidate-on-modify rule closes that gap: any event that carries a real
value for a column evicts that column's cache entry before the next placeholder event for the row can
read it. The TTL is therefore a memory bound, not the correctness mechanism, which is why a
relatively long default (10 minutes, `reselect.cache.ttl.ms=600000`) is acceptable.

### Scope / non-goals

- No new third-party dependency — the default cache reuses `io.debezium.util.BoundedConcurrentHashMap`
  already used across the connector-common module.
- No distributed/shared cache by default — the in-memory implementation is per-task and in-process.
  Shared/persistent backends can be added as separate `ReselectColumnCache` strategies.

---

## Implementation ideas (optional)

- Strategy interface `io.debezium.processors.reselect.cache.ReselectColumnCache` with a row-scoped
  `RowCache` handle, plus a default `MemoryReselectColumnCache`, in `debezium-connector-common`.
- The in-memory implementation uses a hierarchical `row -> (column -> value)` structure so the row
  identity is serialized once per event and shared across all of that row's columns.
- Cached values are the **raw** JDBC values; the existing `getConvertedValue(...)` conversion is
  applied uniformly to both cache hits and freshly queried values, so caching does not interfere with
  value conversion.
- `byte[]` key values are rendered by content (not identity) so binary primary keys produce stable,
  collision-free cache keys.
- Happy to adjust naming, defaults, or split the change as the maintainers prefer.
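
To illustrate the `byte[]` point above: Java arrays use identity-based `equals`/`hashCode`/`toString`, so two reads of the same binary key would otherwise produce different cache keys. The sketch below is one possible rendering, not the proposed code; `RowKeyRendering` and `renderKey` are hypothetical names, and the length-prefix plus type-tag scheme is an assumption chosen to avoid ambiguous concatenations.

```java
public class RowKeyRendering {

    /**
     * Renders a table id and key values into a string cache key.
     * byte[] values are rendered as hex of their content; everything else via String.valueOf.
     */
    static String renderKey(String tableId, Object[] keyValues) {
        StringBuilder key = new StringBuilder(tableId);
        for (Object value : keyValues) {
            StringBuilder part = new StringBuilder();
            if (value instanceof byte[]) {
                part.append('b'); // type tag: binary content follows as hex
                for (byte b : (byte[]) value) {
                    part.append(String.format("%02x", b & 0xff));
                }
            } else {
                part.append('v'); // type tag: textual rendering follows
                part.append(String.valueOf(value));
            }
            // Length prefix keeps adjacent parts from running into each other.
            key.append('|').append(part.length()).append(':').append(part);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        byte[] first = {0x01, 0x02};
        byte[] second = {0x01, 0x02}; // same content, different array instance

        // Identity-based comparison treats the two arrays as different keys.
        System.out.println("arrays equal by identity: " + first.equals(second));

        // Content-based rendering yields the same cache key for both.
        String a = renderKey("inventory.docs", new Object[] {first});
        String b = renderKey("inventory.docs", new Object[] {second});
        System.out.println("rendered keys equal: " + a.equals(b));
    }
}
```

Uniqueness for non-binary key types still depends on their `toString` output being distinct per value, which this sketch does not enforce.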

