Fix sample printer #235

Open
voegtlel wants to merge 1 commit into develop from feature/fix_sample_printer

Conversation

@voegtlel
Collaborator

Indentation was broken, and the printed output was hard to read.

@voegtlel voegtlel force-pushed the feature/fix_sample_printer branch from bffa109 to b1c7bbc Compare May 20, 2026 18:51
@voegtlel voegtlel requested a review from philipp-fischer May 20, 2026 18:53
@philipp-fischer
Collaborator

I don't really like the very custom branch-by-branch indentation algorithm. It would also break for (nested) dataclasses.
I brainstormed with Codex and came up with a clean and simple approach. I'm curious what you think about it:

The idea is to first convert into a clean tree-like structure and then render as ordinary yaml.

I am pasting the rather short code here:
Step 1:

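import dataclasses

import numpy as np
import torch
import yaml


def to_printable(value):
    """Convert a sample into plain dicts, lists, and scalars that yaml.safe_dump can render."""
    if isinstance(value, torch.Tensor):
        return summarize_tensor(value)

    if isinstance(value, np.ndarray):
        return summarize_array(value)

    # is_dataclass() is also True for dataclass types, so restrict this branch to instances.
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {
            type(value).__name__: {
                field.name: to_printable(getattr(value, field.name))
                for field in dataclasses.fields(value)
            }
        }

    if isinstance(value, dict):
        # Keep the first 25 entries and note how many were dropped.
        items = list(value.items())
        out = {k: to_printable(v) for k, v in items[:25]}
        if len(items) > 25:
            out["..."] = f"and {len(items) - 25} more items"
        return out

    if isinstance(value, (list, tuple)):
        # Keep the first 10 elements; tuples are rendered as lists.
        out = [to_printable(v) for v in value[:10]]
        if len(value) > 10:
            out.append(f"... and {len(value) - 10} more items")
        return out

    if isinstance(value, str):
        # Cap long strings at 1000 characters.
        return (value[:1000] + "... (truncated)") if len(value) > 1000 else value

    if isinstance(value, (int, float, bool, type(None))):
        return value

    # Fallback for anything else: a truncated repr.
    return repr(value)[:200]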

Step 2 (render):

yaml.safe_dump(
    to_printable(sample),
    sort_keys=False,
    default_flow_style=False,
    width=120,
).rstrip()

Here summarize_tensor and summarize_array would be similar to the existing summarizing code for these classes.
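
To illustrate the nested dataclass case, here is a minimal, standard-library-only sketch of the conversion step. It is a reduced variant, not the code from this PR: `to_tree`, `FakeArray`, `Box`, and `Sample` are hypothetical names invented for the demo, the truncation limits are left out, and `summarize_array` is a duck-typed stand-in (it only reads `.shape` and `.dtype`) rather than the project's existing summarizer.

```python
import dataclasses


def summarize_array(value):
    # Hypothetical stand-in: relies only on `.shape` and `.dtype`,
    # so the demo needs neither torch nor numpy.
    shape = tuple(value.shape)
    return f"{type(value).__name__}(shape={shape}, dtype={value.dtype})"


def to_tree(value):
    """Reduced conversion step: array-likes, dataclasses, dicts, lists, scalars."""
    if hasattr(value, "shape") and hasattr(value, "dtype"):
        return summarize_array(value)
    # Only dataclass instances, not dataclass types.
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {
            type(value).__name__: {
                f.name: to_tree(getattr(value, f.name))
                for f in dataclasses.fields(value)
            }
        }
    if isinstance(value, dict):
        return {k: to_tree(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_tree(v) for v in value]
    if isinstance(value, (str, int, float, bool, type(None))):
        return value
    return repr(value)[:200]


class FakeArray:
    """Hypothetical array-like used only for this demo."""

    def __init__(self, shape, dtype):
        self.shape = shape
        self.dtype = dtype


@dataclasses.dataclass
class Box:
    xyxy: list
    label: str


@dataclasses.dataclass
class Sample:
    key: str
    image: FakeArray
    boxes: list


sample = Sample(
    key="0001",
    image=FakeArray((3, 224, 224), "uint8"),
    boxes=[Box([0, 0, 10, 10], "cat")],
)

# The nested Box inside the list is converted recursively, like the outer Sample.
tree = to_tree(sample)
```

The resulting `tree` contains only dicts, lists, strings, and numbers, so it can be passed to the `yaml.safe_dump` call from Step 2 and the indentation is left entirely to the YAML emitter.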
