
fix: rows and columns with RichTableCell #653

Open
PeterStaar-IBM wants to merge 1 commit into main from fix/row-column-bbox-with-rich-table-cell

PeterStaar-IBM (Member) commented Jun 22, 2026

What

TableData.get_row_bounding_boxes / get_column_bounding_boxes only looked at cell.bbox. Since TableCell.bbox is
optional and a RichTableCell typically has bbox=None (its content lives behind ref), those cells were silently
skipped. As a result, any row or column containing a rich cell got a bounding box that ignored the rich cell's
actual extent, and a row or column made up only of rich cells got no bounding box at all.

How

Mirroring the existing _get_text pattern, this adds a _get_bbox method:

  • TableCell._get_bbox(doc=None) returns self.bbox.
  • RichTableCell._get_bbox(doc=None) override: an explicit bbox still wins; otherwise, when a doc is supplied, it
    resolves self.ref, walks the referenced subtree, and returns the union (BoundingBox.enclosing_bbox) of all
    descendant provenance boxes. Returns None when no doc / no provenance is available.

get_row_bounding_boxes / get_column_bounding_boxes gain an optional doc keyword and resolve each cell's bbox once
up front via _get_bbox(doc=doc). TableVisualizer forwards doc so the drawn row/column overlays pick up the
resolved boxes.
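
For readers who want the shape of the change without opening the diff, here is a minimal, self-contained sketch of the pattern described above. It is not the docling-core code: Box, Cell, RichCell, and the two grouping functions are simplified stand-ins invented for illustration, doc is modelled as a plain dict mapping a ref string to the provenance boxes found under it (the real code resolves self.ref against a document and walks the subtree), and row/column spans are ignored.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for BoundingBox (top-left origin assumed)."""
    l: float
    t: float
    r: float
    b: float

    @staticmethod
    def enclosing(boxes: Sequence["Box"]) -> "Box":
        # Union: the smallest box that contains every input box.
        return Box(
            l=min(x.l for x in boxes),
            t=min(x.t for x in boxes),
            r=max(x.r for x in boxes),
            b=max(x.b for x in boxes),
        )


# Assumption: a "document" is just ref -> provenance boxes of the subtree.
Doc = Dict[str, List[Box]]


@dataclass
class Cell:
    """Stand-in for TableCell: it only knows its own optional bbox."""
    row: int
    col: int
    bbox: Optional[Box] = None

    def _get_bbox(self, doc: Optional[Doc] = None) -> Optional[Box]:
        return self.bbox


@dataclass
class RichCell(Cell):
    """Stand-in for RichTableCell: content lives behind ref."""
    ref: str = ""

    def _get_bbox(self, doc: Optional[Doc] = None) -> Optional[Box]:
        if self.bbox is not None:  # an explicit bbox still wins
            return self.bbox
        if doc is None:  # nothing to resolve against
            return None
        boxes = doc.get(self.ref, [])
        return Box.enclosing(boxes) if boxes else None


def _group(cells: Sequence[Cell], key: Callable[[Cell], int],
           doc: Optional[Doc]) -> Dict[int, Box]:
    grouped: Dict[int, List[Box]] = {}
    for cell in cells:
        box = cell._get_bbox(doc=doc)  # resolved once per cell
        if box is not None:  # cells without a box are skipped
            grouped.setdefault(key(cell), []).append(box)
    return {idx: Box.enclosing(bs) for idx, bs in grouped.items()}


def row_bounding_boxes(cells: Sequence[Cell],
                       doc: Optional[Doc] = None) -> Dict[int, Box]:
    return _group(cells, lambda c: c.row, doc)


def column_bounding_boxes(cells: Sequence[Cell],
                          doc: Optional[Doc] = None) -> Dict[int, Box]:
    return _group(cells, lambda c: c.col, doc)
```

In this sketch, calling row_bounding_boxes(cells) without doc reproduces the old own-bbox-only behavior, while passing doc lets a rich cell with bbox=None contribute the union of its referenced boxes.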

Compatibility

Backward compatible: doc defaults to None, so existing callers that pass no doc keep current (own-bbox-only)
behavior.

Tests

Adds test_table_data_bounding_boxes_with_rich_table_cell, which builds a table mixing a RichTableCell (bbox=None,
ref → group of text items with prov boxes) and a plain cell, and asserts:

  • _get_bbox returns None without a doc and the correct union with one;
  • without doc the rich cell is skipped (row covers only the plain cell, rich-only column absent);
  • with doc the row/column boxes enclose the referenced extent;
  • an explicit bbox on the rich cell takes precedence.

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
codecov Bot commented Jun 22, 2026

Codecov Report

❌ Patch coverage is 94.11765% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| docling_core/types/doc/document.py | 93.75% | 2 Missing ⚠️ |


github-actions (Contributor)

DCO Check Passed

Thanks @PeterStaar-IBM, all your commits are properly signed off. 🎉

mergify Bot (Contributor) commented Jun 22, 2026

Merge Protections

🔴 1 of 2 protections blocking · waiting on 👀 reviews

| Protection | Waiting on |
|---|---|
| 🔴 Require two reviewer for test updates | 👀 reviews |
| 🟢 Enforce conventional commit | |

🔴 Require two reviewer for test updates

Waiting for

  • #approved-reviews-by >= 2
This rule is failing.

When test data is updated, we require two reviewers

  • #approved-reviews-by >= 2


🟢 Enforce conventional commit

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
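
To check a title locally before pushing, the condition above can be approximated with Python's re module. This is a rough sketch that assumes the ~= operator behaves like an unanchored regex search; the function name is_conventional_title is made up for illustration.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(?:\(.+\))?(!)?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None
```

Under that assumption, this PR's title ("fix: rows and columns with RichTableCell") passes, as would a scoped or breaking-change prefix such as "feat(table)!: ...".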
