Summary
Bring LIST and other composite/nested types (tuple / struct) to first-class support across storage and runtime / ingestion, so that users can model and query nested data without workarounds.
Motivation
Today, support for nested types is fragmented:
- Storage: LIST is partially modeled but lacks recursive component_type=list support; tuple / struct have no dedicated columnar layout (today they would have to be encoded inline, like strings, instead of being split into per-field columns).
- Ingestion (load_data): behavior diverges per format. Parquet has the best LIST handling, JSON partially works, CSV essentially does not. Users hit format-specific cliffs when bringing nested data into the graph.
A unified umbrella lets us land the storage primitives once and have ingestion (and later, query/expression) consume them consistently.
Scope (in)
- Storage layer
  - LIST type with recursive component_type (LIST of LIST, etc.)
  - tuple / struct columnar layout (per-field columns; nested columns)
- Runtime / ingestion layer
  - load_data for CSV / JSON / Parquet: close the format gaps for LIST
  - Round-trip via the columnar storage primitives above
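To make the storage items above concrete, here is a minimal Python sketch of one possible layout: a LIST column stores offsets into a child column (which may itself be a LIST column, giving the recursive component_type), and a struct column stores one child column per field. All names here (TypeDesc, Column, make_column, append, read) are hypothetical and only for illustration; they are not the project's actual storage API, and null handling is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TypeDesc:
    """Hypothetical type descriptor: kind is 'int', 'string', 'list' or 'struct'."""
    kind: str
    component_type: Optional["TypeDesc"] = None                  # element type for 'list'
    fields: Dict[str, "TypeDesc"] = field(default_factory=dict)  # field types for 'struct'


@dataclass
class Column:
    """Hypothetical column; which members are used depends on dtype.kind."""
    dtype: TypeDesc
    values: List[Any] = field(default_factory=list)           # primitive leaves only
    offsets: List[int] = field(default_factory=lambda: [0])   # list: row i = child[offsets[i]:offsets[i+1]]
    child: Optional["Column"] = None                          # list: element column
    children: Dict[str, "Column"] = field(default_factory=dict)  # struct: per-field columns


def make_column(t: TypeDesc) -> Column:
    """Build an empty column tree that mirrors the (possibly nested) type."""
    col = Column(dtype=t)
    if t.kind == "list":
        col.child = make_column(t.component_type)
    elif t.kind == "struct":
        col.children = {name: make_column(ft) for name, ft in t.fields.items()}
    return col


def append(col: Column, value: Any) -> None:
    """Append one row, splitting nested values into the child columns."""
    if col.dtype.kind == "list":
        for item in value:
            append(col.child, item)
        col.offsets.append(col.offsets[-1] + len(value))
    elif col.dtype.kind == "struct":
        for name, child in col.children.items():
            append(child, value[name])
    else:
        col.values.append(value)


def read(col: Column, i: int) -> Any:
    """Reassemble row i from the columnar pieces (the round-trip direction)."""
    if col.dtype.kind == "list":
        return [read(col.child, j) for j in range(col.offsets[i], col.offsets[i + 1])]
    if col.dtype.kind == "struct":
        return {name: read(child, i) for name, child in col.children.items()}
    return col.values[i]


INT = TypeDesc("int")
STRING = TypeDesc("string")

# LIST of LIST of int: the outer offsets index rows of the inner LIST column.
nested = make_column(TypeDesc("list", component_type=TypeDesc("list", component_type=INT)))
for row in ([[1, 2], [3]], [], [[], [4]]):
    append(nested, row)

# struct {name: string, tags: LIST of string}: one child column per field.
person = make_column(TypeDesc("struct", fields={
    "name": STRING,
    "tags": TypeDesc("list", component_type=STRING),
}))
for row in ({"name": "a", "tags": ["x", "y"]}, {"name": "b", "tags": []}):
    append(person, row)
```

In this sketch the struct's name field ends up as a plain string column and tags as its own LIST column, which is the per-field split described above, as opposed to encoding each struct value inline. Reading a row back with read() is the round-trip that ingestion would rely on.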
Scope (out)
Child issues (tracked here)
Storage primitives
- #159: Introduce support for List type in storage (with recursive component_type)
- tuple / struct in columnar format (per-field columns; nested columns)
Ingestion (load_data), moved from #221