[Tracking] Support LIST and composite types (storage + runtime) #446

@longbinlai

Summary

Make LIST and other composite/nested types (tuple / struct) first-class across the storage layer and the runtime / ingestion layer, so that users can model and query nested data without workarounds.

Motivation

Today, support for nested types is fragmented:

  • Storage: LIST is partially modeled but lacks recursive component_type=list support; tuple / struct have no dedicated columnar layout (today they would have to be encoded inline, like strings, instead of being split into per-field columns).
  • Ingestion (load_data): behavior diverges per format — Parquet has the best LIST handling, JSON partially works, CSV essentially does not. Users hit format-specific cliffs when bringing nested data into the graph.

A unified umbrella lets us land the storage primitives once and have ingestion (and later, query/expression) consume them consistently.
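
To make the storage gap concrete, here is a minimal Python sketch contrasting the two struct layouts mentioned above: encoding each struct inline as one opaque value versus splitting it into per-field columns. The function names (`encode_inline`, `split_into_field_columns`) and the use of JSON for the inline form are assumptions for illustration only, not the project's actual storage format.

```python
import json


def encode_inline(rows):
    """Inline layout: each struct is stored as one opaque string value."""
    return [json.dumps(row, sort_keys=True) for row in rows]


def split_into_field_columns(rows, fields):
    """Columnar layout: one column per struct field."""
    return {name: [row[name] for row in rows] for name in fields}


rows = [
    {"name": "a", "age": 30},
    {"name": "b", "age": 41},
]

inline = encode_inline(rows)
columns = split_into_field_columns(rows, ["name", "age"])

# Reading one field from the inline layout decodes every row in full,
# while the columnar layout hands back that field's column directly.
ages_from_inline = [json.loads(value)["age"] for value in inline]
ages_from_columns = columns["age"]
```

Both reads return the same values; the difference is that the inline form has to parse every struct to reach a single field, which is the cost a dedicated per-field layout would avoid.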

Scope (in)

  • Storage layer
    • LIST type with recursive component_type (LIST of LIST, etc.)
    • tuple / struct columnar layout (per-field columns; nested columns)
  • Runtime / ingestion layer
    • load_data for CSV / JSON / Parquet — close the format gaps for LIST
    • Round-trip via the columnar storage primitives above
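
As a rough illustration of what a recursive component_type and a storage round-trip could look like, the sketch below encodes a LIST of LIST column as nested offsets plus a flat child column and then decodes it back. The `ColType` class, the `encode` / `decode` helpers, and the offsets-plus-child encoding are all hypothetical choices for this example (the encoding is similar in spirit to common columnar formats); the issue does not specify the actual layout, and null handling is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ColType:
    kind: str                              # "int64" or "list" in this sketch
    component: Optional["ColType"] = None  # element type when kind == "list"


def encode(values, typ):
    """Encode a column into nested {"offsets", "child"} storage."""
    if typ.kind != "list":
        return {"values": list(values)}
    offsets, flat = [0], []
    for item in values:
        flat.extend(item)          # append this row's elements to the child
        offsets.append(len(flat))  # record where this row ends
    return {"offsets": offsets, "child": encode(flat, typ.component)}


def decode(col, typ):
    """Rebuild the nested Python lists from the encoded column."""
    if typ.kind != "list":
        return list(col["values"])
    flat = decode(col["child"], typ.component)
    offs = col["offsets"]
    return [flat[offs[i]:offs[i + 1]] for i in range(len(offs) - 1)]


# LIST of LIST of int64: the component type of a list is itself a list.
list_of_list = ColType("list", ColType("list", ColType("int64")))
data = [[[1, 2], [3]], [], [[4]]]

col = encode(data, list_of_list)
round_tripped = decode(col, list_of_list)
```

Because `encode` and `decode` recurse on `component`, the same two functions cover any nesting depth, which is the property the recursive component_type item asks of the storage layer.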

Scope (out)

Child issues (tracked here)

Storage primitives

Ingestion (load_data) — moved from #221

Metadata

Labels

compiler (Compiler infrastructure), engine (Storage and execution engine), store (Storage layer)

Projects

Status
To do
