Add JSON Lines (JSONL) input mode #283

Draft
Copilot wants to merge 4 commits into master from copilot/add-json-lines-support

Copilot AI commented Apr 12, 2026

Adds jsonl as a new input mode, enabling GoAWK to process JSON Lines format where each line is a JSON value. Complements the existing CSV/TSV modes.

New mode: INPUTMODE="jsonl" / -i jsonl

JSON arrays → positional fields $1, $2, ...

# Input: ["Gilbert", "2013", 24, true]
BEGIN { INPUTMODE="jsonl" } { print $1, $2, $3, $4 }
# → Gilbert 2013 24 1

JSON objects → named fields via @"key", also accessible as $1, $2, ... in document key order; FIELDS array updated per record

# Input: {"version":"1.1","host":"example.org","short_message":"A log message"}
BEGIN { INPUTMODE="jsonl" } { print @"host", @"short_message" }
# → example.org A log message

Nested objects and arrays are flattened using dot notation for objects and numeric indexes for arrays:

# Input: {"one": 1, "four": [1,2,3,4], "five": {"alpha": ["fo", "fum"], "beta": {"hey": "How's tricks?"}}}
BEGIN { INPUTMODE="jsonl" } { print @"four.0", @"five.alpha.1", @"five.beta.hey" }
# → 1 fum How's tricks?

Accessing a non-scalar key (e.g. @"four" where four is an array) returns "". Positional fields $1, $2, ... and the FIELDS array reflect all flattened scalar leaves in document order.
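The flattening rule above can be sketched in Go as follows. This is a minimal illustration, not the code in interp/jsonl.go: the names field, join, flatten, and flattenLine are hypothetical, and the sketch assumes that walking the input with json.Decoder.Token is an acceptable way to preserve document order (decoding into a Go map would lose it). Non-scalar keys such as four get no entry of their own, which is consistent with @"four" returning "".

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// field is one flattened scalar leaf: its dotted path and decoded value.
// (Hypothetical type for this sketch, not taken from the PR.)
type field struct {
	Name  string
	Value any // string, json.Number, bool, or nil (JSON null)
}

// join appends one path segment, avoiding a leading dot at the top level.
func join(prefix, segment string) string {
	if prefix == "" {
		return segment
	}
	return prefix + "." + segment
}

// flatten consumes exactly one JSON value from dec and appends its scalar
// leaves to out in the order they appear in the input.
func flatten(dec *json.Decoder, prefix string, out *[]field) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	delim, isDelim := tok.(json.Delim)
	if !isDelim { // scalar leaf
		*out = append(*out, field{Name: prefix, Value: tok})
		return nil
	}
	for i := 0; dec.More(); i++ {
		segment := strconv.Itoa(i) // array element: numeric index
		if delim == '{' {          // object member: read the key first
			keyTok, err := dec.Token()
			if err != nil {
				return err
			}
			segment, _ = keyTok.(string)
		}
		if err := flatten(dec, join(prefix, segment), out); err != nil {
			return err
		}
	}
	_, err = dec.Token() // consume the closing '}' or ']'
	return err
}

// flattenLine flattens one JSON Lines record.
func flattenLine(line string) ([]field, error) {
	dec := json.NewDecoder(strings.NewReader(line))
	dec.UseNumber() // keep numbers as their original text
	var fields []field
	err := flatten(dec, "", &fields)
	return fields, err
}

func main() {
	line := `{"one": 1, "four": [1,2,3,4], "five": {"alpha": ["fo", "fum"], "beta": {"hey": "How's tricks?"}}}`
	fields, err := flattenLine(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, f := range fields {
		fmt.Printf("$%d  %-14s %v\n", i+1, f.Name, f.Value)
	}
}
```

Each printed row pairs a positional index with the name that @"..." would use. Keys that themselves contain a dot would be ambiguous under this scheme; the sketch does not try to resolve that.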

Type conversions

| JSON | AWK |
| --- | --- |
| true | "1" |
| false | "0" |
| null | "" |
| number | numeric string |
| string | string (unescaped) |

Implementation

  • interp/jsonl.go (new): jsonlSplitter (mirrors csvSplitter — pre-parses each line before action bodies run, so FIELDS and per-record state are ready), parseJSONLineToFields(), flattenObject() / flattenArray() / flattenJSONValue() for recursive flattening, and JSON→AWK conversion helpers
  • interp/interp.go: JSONLMode IOMode = 3; getFieldByName() calls ensureFields() first in JSONL mode (per-record object keys, unlike CSV's fixed header); updated parseInputMode() / inputModeString()
  • interp/io.go: newScanner() uses jsonlSplitter for JSONLMode; ensureFields() re-parses from $0 on reassignment; setFieldNames(nil) correctly clears FIELDS for array records
  • Empty lines are skipped; $0 holds the raw JSON line; $0 reassignment re-parses as JSON

Copilot AI linked an issue Apr 12, 2026 that may be closed by this pull request
Copilot AI and others added 2 commits April 12, 2026 20:19
Agent-Logs-Url: https://github.com/benhoyt/goawk/sessions/29bc6e5b-0551-41a0-83cf-9d679344b4d0

Co-authored-by: benhoyt <999033+benhoyt@users.noreply.github.com>
…ing, better error message

Agent-Logs-Url: https://github.com/benhoyt/goawk/sessions/29bc6e5b-0551-41a0-83cf-9d679344b4d0

Co-authored-by: benhoyt <999033+benhoyt@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add JSON Lines support for GoAWK" to "Add JSON Lines (JSONL) input mode" Apr 12, 2026
Copilot AI requested a review from benhoyt April 12, 2026 20:21
Owner

benhoyt commented Apr 12, 2026

@copilot thanks. For nested objects and arrays, I'd much rather these were flattened to @"foo.nested" or @"foo.0". Can you please try that?

Author

Copilot AI commented Apr 12, 2026

@copilot thanks. For nested objects and arrays, I'd much rather these were flattened to @"foo.nested" or @"foo.0". Can you please try that?

Done in 9612689. Nested JSON values are now flattened recursively using dot notation:

  • Nested object key foo.bar → @"foo.bar"
  • Array element at index i under key foo → @"foo.0", @"foo.1", etc.
  • This works at any nesting depth: @"five.alpha.1", @"five.beta.hey", etc.

For example:

{"one": 1, "four": [1,2,3,4], "five": {"alpha": ["fo", "fum"], "beta": {"hey": "How's tricks?"}}}

gives @"four.0" → 1, @"five.alpha.1" → fum, @"five.beta.hey" → How's tricks?

Accessing @"foo" where foo is a non-scalar key returns "" (empty string, not an error). Positional fields $1, $2, ... and FIELDS reflect all flattened scalar leaves in document order.

Development

Successfully merging this pull request may close these issues.

Add JSON Lines support