[ BOUNTY] Add JSONL output to log_aggregator (#3) by xcapselx · Pull Request #7 · thanhle74/kickama

xcapselx · 2026-06-18T23:13:14Z

Summary

Add --format text|jsonl output support to tools/log_aggregator.py so downstream tooling can parse aggregated logs without scraping terminal text. text is the new default; jsonl emits one JSON record per line.

Changes

tools/log_aggregator.py:
- Changed --format choices from json|csv|html to text|jsonl|json|csv|html with text as default
- Added export_jsonl() method that writes one JSON record per line with timestamp, level, source, message, and metadata fields
- Added _sorted_entries_for_jsonl() to sort entries by timestamp (oldest first); entries without timestamps appear after timestamped ones
- Added _entry_to_jsonl_record() to map internal entry fields to the JSONL schema (ISO 8601 timestamps, service mapped to source)
- Track unparseable lines in unparseable_lines list; JSONL output emits a warning record for each
- Fixed string timestamp handling in _parse_line() — JSON logs with ISO 8601 string timestamps now convert to Unix timestamps instead of crashing
- Made TextLogParser require at least a timestamp or a log level keyword; lines with no recognizable structure are now treated as unparseable
tests/test_log_aggregator_jsonl.py (new file):
- test_jsonl_output_with_json_logs — JSONL output from JSON-format logs, verifies all required fields
- test_jsonl_output_with_text_logs — JSONL output from text-format logs, verifies all required fields
- test_jsonl_sorted_by_timestamp — verifies chronological ordering
- test_unparseable_lines_produce_warning_records — verifies warning records for unparseable lines
- test_text_format_is_default — verifies default format is text
- test_jsonl_format_choice_exists — verifies jsonl is a valid format choice
docs/OPERATIONS.md:
- Added "Log Aggregator JSONL Output" section documenting the JSONL record schema, ordering rules, warning record format, and supported input formats

Testing

python -m unittest tests.test_log_aggregator_jsonl -v

Result:

test_jsonl_format_choice_exists ... ok
test_jsonl_output_with_json_logs ... ok
test_jsonl_output_with_text_logs ... ok
test_jsonl_sorted_by_timestamp ... ok
test_text_format_is_default ... ok
test_unparseable_lines_produce_warning_records ... ok

Ran 6 tests in 0.011s
OK

Build diagnostic: diagnostic/build-338f0f7a.json (encryptly .logd unavailable on Windows; JSON metadata included).

Checklist

Relevant modules affected by these changes build locally
Tests pass locally
Diagnostic build log is committed in this PR
Documentation has been updated (docs/OPERATIONS.md)
Changes are scoped to the PR purpose and avoid unrelated cleanup
Security, privacy, and error-handling implications have been considered

Closes #3

Summary by CodeRabbit

New Features
- Added JSONL export support to the log aggregator for machine-readable downstream consumption.
- Expanded the CLI --format options and changed the default output to text.
Bug Fixes
- Improve parsing behavior by skipping lines without detectable timestamps/valid levels.
- JSONL export now includes warning records for unparseable lines and outputs records in timestamp order.
Documentation
- Updated operations guide with JSONL schema, usage, and supported input formats.
Tests
- Added unit tests covering JSONL export, sorting, warnings, and CLI format defaults.

- Add --format text|jsonl with text as default (was json) - JSONL records include timestamp, level, source, message, metadata - Sort entries by timestamp when available - Warning records for unparseable lines - Fix string timestamp handling in _parse_line - Make TextLogParser require timestamp or level keyword - Add tests covering JSON and text log formats - Document JSONL schema in docs/OPERATIONS.md

coderabbitai · 2026-06-18T23:13:26Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 007eb786-e5cd-4cba-bee5-2e58d9c42f7e

📥 Commits

Reviewing files that changed from the base of the PR and between 3edf7c9 and 8385927.

📒 Files selected for processing (2)

tests/test_log_aggregator_jsonl.py
tools/log_aggregator.py

🚧 Files skipped from review as they are similar to previous changes (2)

tools/log_aggregator.py
tests/test_log_aggregator_jsonl.py

📝 Walkthrough

Walkthrough

Adds JSONL export capability to LogAggregator: TextLogParser.parse() gains a pre-filter to skip lines with neither timestamp nor level; _parse_line normalizes ISO timestamps to epoch and tracks unparseable lines; export_jsonl() writes ordered JSONL records plus warning entries. CLI --format gains jsonl/text choices with text as the new default. A test suite and OPERATIONS.md section document the feature. A diagnostic JSON report for a failed encryptly-preflight build is also included.

Changes

JSONL Export Feature

Layer / File(s)	Summary
Parser pre-filter and aggregator internals `tools/log_aggregator.py`	`TextLogParser.parse()` returns `None` when a line has neither timestamp nor recognizable level. `LogAggregator.__init__` adds `unparseable_lines` list. `_parse_line` converts ISO-8601 string timestamps to epoch integers for hourly bucketing and appends stripped non-parsed lines to `unparseable_lines`.
`export_jsonl()` method and CLI wiring `tools/log_aggregator.py`	`export_jsonl()` writes timestamp-ordered JSONL records for parsed entries and appends warning records with `raw_line` for each unparseable item. `parse_args()` expands `--format` to `["text", "jsonl", "json", "csv", "html"]` with `text` as the new default. `main()` adds explicit `jsonl`/`json` branches and replaces the old JSON fallback with `pass`.
Tests and documentation `tests/test_log_aggregator_jsonl.py`, `docs/OPERATIONS.md`	`TestJSONLOutput` covers JSON log export, text log export, timestamp-sorted output, warning records for unparseable lines, default format, and `--format jsonl` CLI choice. `OPERATIONS.md` documents the JSONL record schema, ordering rules, warning record structure, and supported input formats.

Diagnostic Build Report

Layer / File(s)	Summary
Diagnostic JSON build report `diagnostic/build-338f0f7a.json`	New report records the `encryptly-preflight` module `FAIL` status due to a missing `encryptly` binary on `windows-x64` for commit `338f0f7a`, with pass/fail counts and a PR note about the absent `.logd` artifact.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant CLI as parse_args / main()
    participant LogAggregator
    participant TextLogParser
    participant OutputFile

    User->>CLI: python3 log_aggregator.py --format jsonl
    CLI->>LogAggregator: process_file(input_path)
    loop each line
        LogAggregator->>TextLogParser: parse(line)
        TextLogParser-->>LogAggregator: entry or None
        alt parsed entry
            LogAggregator->>LogAggregator: normalize ISO timestamp → epoch
        else unparseable
            LogAggregator->>LogAggregator: append to unparseable_lines
        end
    end
    CLI->>LogAggregator: export_jsonl(output_path)
    LogAggregator->>LogAggregator: sort entries by timestamp
    loop parsed entries
        LogAggregator->>OutputFile: write JSONL record
    end
    loop unparseable_lines
        LogAggregator->>OutputFile: write warning JSONL record with raw_line
    end
    OutputFile-->>User: .jsonl file

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 Hoppity-hop, I parsed the logs today,
JSON lines sorted, warnings on display!
Unparseable crumbs? I warned them with care,
A --format jsonl flag floats in the air.
The diagnostic said "no binary found" —
But the JSONL export is safe and sound! 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately describes the main change: adding JSONL output functionality to the log_aggregator tool, which is the primary focus of this changeset.
Description check	✅ Passed	The PR description comprehensively covers all required template sections including Summary, Changes, Testing, and a completed Checklist with all relevant items marked. Documentation updates and diagnostic build logs are included as required.
Linked Issues check	✅ Passed	All acceptance criteria from issue `#3` are met: JSONL feature fully implemented with export_jsonl() method and supporting functions, existing tests pass plus 6 new tests added and passing, code follows conventions, and diagnostic artifacts included.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the PR purpose of adding JSONL output support. Changes include the log_aggregator implementation, comprehensive tests, documentation updates, and build diagnostics—no unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@diagnostic/build-338f0f7a.json`:
- Line 5: The diagnostic JSON file contains a non-null diagnostic_logd_error
field indicating the encryptly binary is missing on windows-x64, which will
trigger a CI validation failure in the diagnostic-build-log.yml workflow that
explicitly rejects files with this error. To fix this, either re-run the python3
build.py script on a platform where the encryptly binary is available (such as
linux-x64 or macos-x64) to generate a valid diagnostic file with the encrypted
.logd file present, or remove the diagnostic file entirely from the commit if
the JSONL feature does not require a passing build diagnostic.

In `@tests/test_log_aggregator_jsonl.py`:
- Line 58: Replace all four instances of tempfile.mktemp(suffix=".jsonl") with a
secure temporary file API. Instead of using mktemp(), use
NamedTemporaryFile(delete=False) to create a named temporary file with the
.jsonl suffix, then extract the file path from the returned object. This applies
to lines 58, 77, 96, and 112 in the test file. Ensure that you properly handle
the file object returned by NamedTemporaryFile and close it if needed before
using the path.

In `@tools/log_aggregator.py`:
- Around line 254-263: In the timestamp parsing logic within the try-except
block, when catching ValueError or TypeError exceptions, set both ts = None and
entry['timestamp'] = None to ensure the timestamp is properly normalized to None
instead of remaining as an invalid string. Additionally, change the conditional
check from if ts and isinstance(ts, (int, float)): to if ts is not None and
isinstance(ts, (int, float)): to properly handle valid epoch 0 timestamps, which
are falsy but should not be skipped.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52b35326-daa8-4faf-b8ec-8e41ad73dd16

📥 Commits

Reviewing files that changed from the base of the PR and between 94e0fb0 and 3edf7c9.

📒 Files selected for processing (4)

diagnostic/build-338f0f7a.json
docs/OPERATIONS.md
tests/test_log_aggregator_jsonl.py
tools/log_aggregator.py

coderabbitai · 2026-06-18T23:16:58Z

+  "generated_at": "2026-06-18T23:12:41.974269+00:00",
+  "commit": "338f0f7a",
+  "diagnostic_logd": null,
+  "diagnostic_logd_error": "encryptly binary not found (detected windows-x64; available: linux-arm64, linux-x64, macos-arm64, macos-x64, windows-arm64, windows-x64); cannot create diagnostic\\build-338f0f7a.logd",


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

CI validation blocker: diagnostic_logd_error will cause PR to fail CI checks.

The diagnostic report includes diagnostic_logd_error (line 5) and message_blocker (line 6), indicating the encryptly preflight build failed due to a missing encryptly binary on windows-x64. However, the CI workflow (.github/workflows/diagnostic-build-log.yml lines 160–162) explicitly rejects any diagnostic JSON file with a non-null diagnostic_logd_error:

if diagnostic_logd_error: failures.append((json_path, f"Build script reported diagnostic_logd_error: {diagnostic_logd_error}")) continue

This will cause the PR's CI validation to fail, blocking merge. The diagnostic artifact requirements mandate that a valid encrypted .logd file be present in the commit. To proceed, either:

Re-run python3 build.py on a platform where the encryptly binary is available (linux-x64, macos-x64, etc.), or

Remove this diagnostic file from the PR commit if the JSONL feature itself does not require a passing build diagnostic.

Also applies to: 6-6

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@diagnostic/build-338f0f7a.json` at line 5, The diagnostic JSON file contains a non-null diagnostic_logd_error field indicating the encryptly binary is missing on windows-x64, which will trigger a CI validation failure in the diagnostic-build-log.yml workflow that explicitly rejects files with this error. To fix this, either re-run the python3 build.py script on a platform where the encryptly binary is available (such as linux-x64 or macos-x64) to generate a valid diagnostic file with the encrypted .logd file present, or remove the diagnostic file entirely from the commit if the JSONL feature does not require a passing build diagnostic.

- Replace tempfile.mktemp() with mkstemp-based _temp_output_path helper - Use 'is not None' instead of truthy check for timestamp (handles epoch 0) - Normalize invalid string timestamps to None instead of leaving as string - All 6 tests still passing

* Add benchmark rate limiter bypass flag * Add benchmark diagnostic report * Remove incomplete benchmark diagnostic report * Add benchmark build diagnostics

LE-VAI added 2 commits June 18, 2026 19:12

Add build diagnostics for 338f0f7

3edf7c9

coderabbitai Bot reviewed Jun 18, 2026

View reviewed changes

thanhle74 pushed a commit that referenced this pull request Jun 20, 2026

Add benchmark rate limiter bypass flag (#7)

89a2197

* Add benchmark rate limiter bypass flag * Add benchmark diagnostic report * Remove incomplete benchmark diagnostic report * Add benchmark build diagnostics

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ BOUNTY] Add JSONL output to log_aggregator (#3)#7

[ BOUNTY] Add JSONL output to log_aggregator (#3)#7
xcapselx wants to merge 3 commits into
thanhle74:mainfrom
xcapselx:feat/jsonl-output-thanhle74

xcapselx commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

xcapselx commented Jun 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Testing

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

xcapselx commented Jun 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 18, 2026 •

edited

Loading