[ BOUNTY] Add JSONL output to log_aggregator (#3)#7
Conversation
- Add --format text|jsonl with text as default (was json) - JSONL records include timestamp, level, source, message, metadata - Sort entries by timestamp when available - Warning records for unparseable lines - Fix string timestamp handling in _parse_line - Make TextLogParser require timestamp or level keyword - Add tests covering JSON and text log formats - Document JSONL schema in docs/OPERATIONS.md
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds JSONL export capability to ChangesJSONL Export Feature
Diagnostic Build Report
Sequence Diagram(s)sequenceDiagram
actor User
participant CLI as parse_args / main()
participant LogAggregator
participant TextLogParser
participant OutputFile
User->>CLI: python3 log_aggregator.py --format jsonl
CLI->>LogAggregator: process_file(input_path)
loop each line
LogAggregator->>TextLogParser: parse(line)
TextLogParser-->>LogAggregator: entry or None
alt parsed entry
LogAggregator->>LogAggregator: normalize ISO timestamp → epoch
else unparseable
LogAggregator->>LogAggregator: append to unparseable_lines
end
end
CLI->>LogAggregator: export_jsonl(output_path)
LogAggregator->>LogAggregator: sort entries by timestamp
loop parsed entries
LogAggregator->>OutputFile: write JSONL record
end
loop unparseable_lines
LogAggregator->>OutputFile: write warning JSONL record with raw_line
end
OutputFile-->>User: .jsonl file
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@diagnostic/build-338f0f7a.json`:
- Line 5: The diagnostic JSON file contains a non-null diagnostic_logd_error
field indicating the encryptly binary is missing on windows-x64, which will
trigger a CI validation failure in the diagnostic-build-log.yml workflow that
explicitly rejects files with this error. To fix this, either re-run the python3
build.py script on a platform where the encryptly binary is available (such as
linux-x64 or macos-x64) to generate a valid diagnostic file with the encrypted
.logd file present, or remove the diagnostic file entirely from the commit if
the JSONL feature does not require a passing build diagnostic.
In `@tests/test_log_aggregator_jsonl.py`:
- Line 58: Replace all four instances of tempfile.mktemp(suffix=".jsonl") with a
secure temporary file API. Instead of using mktemp(), use
NamedTemporaryFile(delete=False) to create a named temporary file with the
.jsonl suffix, then extract the file path from the returned object. This applies
to lines 58, 77, 96, and 112 in the test file. Ensure that you properly handle
the file object returned by NamedTemporaryFile and close it if needed before
using the path.
In `@tools/log_aggregator.py`:
- Around line 254-263: In the timestamp parsing logic within the try-except
block, when catching ValueError or TypeError exceptions, set both ts = None and
entry['timestamp'] = None to ensure the timestamp is properly normalized to None
instead of remaining as an invalid string. Additionally, change the conditional
check from if ts and isinstance(ts, (int, float)): to if ts is not None and
isinstance(ts, (int, float)): to properly handle valid epoch 0 timestamps, which
are falsy but should not be skipped.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 52b35326-daa8-4faf-b8ec-8e41ad73dd16
📒 Files selected for processing (4)
diagnostic/build-338f0f7a.jsondocs/OPERATIONS.mdtests/test_log_aggregator_jsonl.pytools/log_aggregator.py
| "generated_at": "2026-06-18T23:12:41.974269+00:00", | ||
| "commit": "338f0f7a", | ||
| "diagnostic_logd": null, | ||
| "diagnostic_logd_error": "encryptly binary not found (detected windows-x64; available: linux-arm64, linux-x64, macos-arm64, macos-x64, windows-arm64, windows-x64); cannot create diagnostic\\build-338f0f7a.logd", |
There was a problem hiding this comment.
CI validation blocker: diagnostic_logd_error will cause PR to fail CI checks.
The diagnostic report includes diagnostic_logd_error (line 5) and message_blocker (line 6), indicating the encryptly preflight build failed due to a missing encryptly binary on windows-x64. However, the CI workflow (.github/workflows/diagnostic-build-log.yml lines 160–162) explicitly rejects any diagnostic JSON file with a non-null diagnostic_logd_error:
if diagnostic_logd_error:
failures.append((json_path, f"Build script reported diagnostic_logd_error: {diagnostic_logd_error}"))
continue
This will cause the PR's CI validation to fail, blocking merge. The diagnostic artifact requirements mandate that a valid encrypted .logd file be present in the commit. To proceed, either:
- Re-run
python3 build.pyon a platform where the encryptly binary is available (linux-x64, macos-x64, etc.), or - Remove this diagnostic file from the PR commit if the JSONL feature itself does not require a passing build diagnostic.
Also applies to: 6-6
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@diagnostic/build-338f0f7a.json` at line 5, The diagnostic JSON file contains
a non-null diagnostic_logd_error field indicating the encryptly binary is
missing on windows-x64, which will trigger a CI validation failure in the
diagnostic-build-log.yml workflow that explicitly rejects files with this error.
To fix this, either re-run the python3 build.py script on a platform where the
encryptly binary is available (such as linux-x64 or macos-x64) to generate a
valid diagnostic file with the encrypted .logd file present, or remove the
diagnostic file entirely from the commit if the JSONL feature does not require a
passing build diagnostic.
- Replace tempfile.mktemp() with mkstemp-based _temp_output_path helper - Use 'is not None' instead of truthy check for timestamp (handles epoch 0) - Normalize invalid string timestamps to None instead of leaving as string - All 6 tests still passing
* Add benchmark rate limiter bypass flag * Add benchmark diagnostic report * Remove incomplete benchmark diagnostic report * Add benchmark build diagnostics
Summary
Add
--format text|jsonloutput support totools/log_aggregator.pyso downstream tooling can parse aggregated logs without scraping terminal text.textis the new default;jsonlemits one JSON record per line.Changes
tools/log_aggregator.py:--formatchoices fromjson|csv|htmltotext|jsonl|json|csv|htmlwithtextas defaultexport_jsonl()method that writes one JSON record per line withtimestamp,level,source,message, andmetadatafields_sorted_entries_for_jsonl()to sort entries by timestamp (oldest first); entries without timestamps appear after timestamped ones_entry_to_jsonl_record()to map internal entry fields to the JSONL schema (ISO 8601 timestamps,servicemapped tosource)unparseable_lineslist; JSONL output emits a warning record for each_parse_line()— JSON logs with ISO 8601 string timestamps now convert to Unix timestamps instead of crashingTextLogParserrequire at least a timestamp or a log level keyword; lines with no recognizable structure are now treated as unparseabletests/test_log_aggregator_jsonl.py(new file):test_jsonl_output_with_json_logs— JSONL output from JSON-format logs, verifies all required fieldstest_jsonl_output_with_text_logs— JSONL output from text-format logs, verifies all required fieldstest_jsonl_sorted_by_timestamp— verifies chronological orderingtest_unparseable_lines_produce_warning_records— verifies warning records for unparseable linestest_text_format_is_default— verifies default format istexttest_jsonl_format_choice_exists— verifiesjsonlis a valid format choicedocs/OPERATIONS.md:Testing
Result:
Build diagnostic:
diagnostic/build-338f0f7a.json(encryptly .logd unavailable on Windows; JSON metadata included).Checklist
Closes #3
Summary by CodeRabbit
--formatoptions and changed the default output to text.