Summary
The object_fields measurement primitive (see
product/specs/inspector-layers-spec.md) builds a per-field data dictionary over
a set of frontmatter objects: presence, type histogram, cardinality, and the
most common values.
Its first cut characterizes scalar values only (keeping string and numeric
scalars distinct). Array and nested-object values are counted as present and
typed, but not characterized further. This issue tracks extending the
characterization to array and nested-object values.
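As a rough illustration of the first-cut behaviour described above, the following Python sketch builds a per-field dictionary over a list of frontmatter objects. The function name object_fields_sketch, the output keys, and the top_n parameter are hypothetical and not taken from the spec. The sketch only mirrors the four measurements listed (presence, type histogram, cardinality, most common values) and characterizes scalars only.

```python
from collections import Counter, defaultdict

SCALAR_TYPES = {"string", "number", "boolean"}


def type_name(value):
    """Coarse type label; bool is checked before int because bool subclasses int."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return type(value).__name__


def object_fields_sketch(objects, top_n=3):
    """Hypothetical stand-in for the primitive: scalars only, as in the first cut."""
    presence = Counter()           # field -> number of objects that have it
    types = defaultdict(Counter)   # field -> type histogram
    values = defaultdict(Counter)  # field -> counts of (type, scalar value)
    for obj in objects:
        for key, value in obj.items():
            presence[key] += 1
            t = type_name(value)
            types[key][t] += 1
            if t in SCALAR_TYPES:
                # Keying on (type, value) keeps "2020" and 2020 distinct.
                values[key][(t, value)] += 1
            # Arrays and nested objects are counted and typed above, nothing more.
    return {
        key: {
            "present": presence[key],
            "types": dict(types[key]),
            "cardinality": len(values[key]),
            "common": [(v, n) for (_, v), n in values[key].most_common(top_n)],
        }
        for key in presence
    }
```

With this sketch, a tags field holding arrays would report a type histogram of {"array": n} with cardinality 0 and no common values, which is the gap this issue describes.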
Motivation
Array fields are common in wikis (tags: [a, b, c] is the obvious case), but
today they report only "present, array". That omits the element-value
distribution an agent would need to recognize an enum-like tag field. Nested
objects (meta: {…}) are similarly opaque.
Proposal
Two independent extensions, in rough priority order:
- Array elements. Treat an array of scalars as a multiset of its elements
and run the dictionary over the flattened elements (cardinality, common
values). Decide handling for arrays of objects.
- Nested objects. Flatten meta.x into dotted keys and dictionary those
too. Watch for key explosion and output growth on deeply nested data.
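The two extensions could be sketched as follows, under stated assumptions. element_counts and flatten are hypothetical helper names, not part of the spec. Skipping non-scalar array elements, and leaving arrays as leaf values during flattening, are placeholder choices, since the proposal leaves the handling of arrays of objects undecided.

```python
from collections import Counter


def element_counts(objects, field):
    """Pool the scalar elements of an array field across objects, as one multiset."""
    counts = Counter()
    for obj in objects:
        value = obj.get(field)
        if not isinstance(value, list):
            continue  # scalar or missing values are not this helper's concern
        for element in value:
            # Assumption: non-scalar elements (objects, nested arrays) are skipped.
            if isinstance(element, (str, int, float, bool)):
                # Key on (type name, value) so "1" and 1 stay distinct.
                counts[(type(element).__name__, element)] += 1
    return counts


def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted keys: {"meta": {"x": 1}} -> {"meta.x": 1}."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path))
        else:
            # Scalars, arrays, and empty dicts are kept as leaf values.
            flat[path] = value
    return flat
```

Cardinality for an array field is then len(element_counts(...)), and common values come from its most_common(). Flattened objects can be fed back through the existing scalar dictionary unchanged; every distinct path becomes its own field, which is where the key-explosion concern comes from.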
Tradeoffs
Captured in the inspector-layers spec, Open Question 4 (resolved: scalars only
for the first cut, this deferred to a follow-up):
- Scalars only — simple, bounded, but misses the common tag case.
- Array-element characterization — makes tag/label fields legible; adds a second
aggregation mode; ambiguous for arrays of objects.
- Nested-object recursion — full coverage, but risks key explosion and thin
signal for deep nesting.
Notes
object_fields primitive first.