[bug] ClickHouse adapter does not surface column-level comment from system.columns #706

@dennisUhrskov

Description

Produced with Claude Code

ClickHouse natively supports column-level COMMENT metadata via ALTER TABLE … COMMENT COLUMN '…'. dbt's +persist_docs: {columns: true} writes schema.yml column descriptions into this field. nao's description.md.j2 and columns.md.j2 templates both reference col.description, so the rendering side is ready. The adapter, however, never fetches the comment, so col.description is always None.

Where the data is dropped

File nao_core/config/databases/clickhouse.py:

  1. _columns_from_system (line 324) selects only name, type, default_kind, and default_expression; it omits comment. It then sets description: None on every returned column dict (line 341).
  2. ClickHouseDatabaseContext.columns() (line 453) builds its column dicts via self.table.schema() (an ibis call that returns no comment metadata) and hardcodes "description": None (line 464). It then merges in the output of _columns_from_system for type/default details, but the comment is still never read.

Repro

  1. Build a dbt model with column descriptions in schema.yml and +persist_docs: {columns: true}.
  2. Verify in ClickHouse: SELECT name, comment FROM system.columns WHERE database = '…' AND table = '…' returns populated comments.
  3. Run nao sync -p databases.
  4. Inspect databases/...//columns.md: every column line is rendered without its trailing quoted description (the , "…" suffix) because col.description is None.
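
Steps 2 and 4 can be compared mechanically. The following is a minimal sketch, assuming the rows from system.columns and the column dicts produced by the adapter are already available as lists of dicts; the function name columns_missing_description and the sample data are hypothetical and not part of nao.

```python
def columns_missing_description(system_rows, rendered_columns):
    """Return names of columns whose ClickHouse comment is set
    but whose adapter-side description is empty or None."""
    # Map column name -> stripped comment text (empty string if absent).
    comments = {r["name"]: (r.get("comment") or "").strip() for r in system_rows}
    return [
        c["name"]
        for c in rendered_columns
        if comments.get(c["name"]) and not c.get("description")
    ]


# Hypothetical sample: "id" has a comment in ClickHouse, "amount" does not.
system_rows = [
    {"name": "id", "comment": "Primary key"},
    {"name": "amount", "comment": ""},
]
# What the adapter currently returns: description is always None.
rendered_columns = [
    {"name": "id", "description": None},
    {"name": "amount", "description": None},
]
missing = columns_missing_description(system_rows, rendered_columns)
```

With the current adapter behavior, every column that has a comment in ClickHouse would show up in this list; after a fix it should be empty.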

Suggested fix

In _columns_from_system, add comment to the SELECT list and to the returned dict:

    sql = (
        "SELECT name, type, default_kind, default_expression, comment "
        f"FROM system.columns WHERE database = '{d}' AND table = '{t}' ORDER BY position"
    )
    ...
    return [
        {
            "name": r["name"],
            "type": str(r.get("type", "")),
            "nullable": "Nullable" in str(r.get("type", "")),
            "description": (str(r.get("comment", "")).strip() or None),
            "default_kind": str(r.get("default_kind", "")).strip() or None,
            "default_expression": str(r.get("default_expression", "")).strip() or None,
        }
        for r in rows
    ]

Then in ClickHouseDatabaseContext.columns(), when merging system metadata back into the ibis-derived column list, also propagate the description:

    descriptions = {col["name"]: col.get("description") for col in system_columns}
    ...
    for col in cols:
        ...
        if name in descriptions:
            col["description"] = descriptions[name]
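
The merge step can be sketched as a standalone function. This is an illustration only: merge_descriptions is a hypothetical name, and it assumes both lists contain dicts keyed by "name", as in the snippets above.

```python
def merge_descriptions(cols, system_columns):
    """Copy descriptions from system_columns into the matching entries of cols.

    Columns that do not appear in system_columns keep their existing description.
    """
    descriptions = {col["name"]: col.get("description") for col in system_columns}
    for col in cols:
        name = col["name"]
        if name in descriptions:
            col["description"] = descriptions[name]
    return cols


# Hypothetical ibis-derived columns: description is hardcoded to None today.
ibis_cols = [
    {"name": "id", "type": "int64", "description": None},
    {"name": "extra", "type": "string", "description": None},
]
# Hypothetical output of _columns_from_system after the fix.
system_cols = [{"name": "id", "description": "Primary key"}]

merged = merge_descriptions(ibis_cols, system_cols)
```

Here "id" picks up its comment, while "extra", which has no matching system row, is left untouched.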

Adjacent observation (lower priority)

The DatabaseTemplate enum no longer includes DESCRIPTION, and _migrate_accessors_to_templates silently strips "description" from user configs (base.py:119). However, the description.md.j2 template still ships in the package, and projects upgrading from older versions retain stale description.md files in their databases/ tree because the cleanup step only removes whole-table directories that disappear from the warehouse, not orphaned per-template files within a still-existing table directory. Either delete the now-unreachable template file or have the cleanup step prune files that don't correspond to a currently-enabled template.
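
The second option could look roughly like the sketch below. It assumes each enabled template named X renders to a file X.md directly inside the table directory (as description.md.j2 and columns.md.j2 suggest); the function prune_orphaned_files and its parameters are hypothetical and do not reflect nao's actual cleanup code.

```python
import tempfile
from pathlib import Path


def prune_orphaned_files(table_dir, enabled_templates):
    """Delete *.md files in table_dir that no enabled template would produce.

    Returns the sorted list of file names that were removed.
    """
    keep = {f"{name}.md" for name in enabled_templates}
    removed = []
    for path in sorted(Path(table_dir).glob("*.md")):
        if path.name not in keep:
            path.unlink()
            removed.append(path.name)
    return removed


def demo():
    """Create a throwaway table directory with a stale description.md and prune it."""
    with tempfile.TemporaryDirectory() as tmp:
        table_dir = Path(tmp)
        for name in ("columns.md", "description.md"):
            (table_dir / name).write_text("stale")
        removed = prune_orphaned_files(table_dir, enabled_templates={"columns"})
        remaining = sorted(p.name for p in table_dir.iterdir())
    return removed, remaining
```

Calling demo() removes only description.md and leaves columns.md in place. A real implementation would probably want a dry-run mode, since it deletes files from the user's project tree.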

Metadata

Status: Done