Skip to content

feat: add health snapshot reporting + CI artifacts#3

Merged
GopiB9119 merged 3 commits into
mainfrom
feat/health-snapshot-reporting
Mar 5, 2026
Merged

feat: add health snapshot reporting + CI artifacts#3
GopiB9119 merged 3 commits into
mainfrom
feat/health-snapshot-reporting

Conversation

@GopiB9119

Copy link
Copy Markdown
Owner

Summary

  • Add scripts/health-snapshot.js to run JS syntax checks + Python agent unit tests and write timestamped reports.
  • Add npm scripts:
    • health:snapshot
    • health:snapshot:quiet
  • Extend python-agent-tests workflow to:
    • setup Node + install npm deps
    • run health snapshot in quiet mode
    • upload logs/health/*.md artifacts
  • Update README with usage and retention behavior.

Validation

  • node --check scripts/health-snapshot.js
  • cmd /c npm run health:snapshot:quiet
  • npm run agent:test:py

Copilot AI review requested due to automatic review settings March 5, 2026 10:35
@GopiB9119 GopiB9119 merged commit 212ba58 into main Mar 5, 2026
4 checks passed
@GopiB9119 GopiB9119 deleted the feat/health-snapshot-reporting branch March 5, 2026 10:37

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new “health snapshot” script to generate timestamped health reports (JS syntax checks + Python agent unit tests), wires it into npm scripts, and extends CI workflows to run/report these checks and adjust evaluation gating behavior.

Changes:

  • Add scripts/health-snapshot.js to run Node --check across key JS files + Python unit tests and emit logs/health/health-snapshot-*.md reports with retention pruning.
  • Add npm scripts health:snapshot and health:snapshot:quiet.
  • Update CI: run the health snapshot in python-agent-tests and upload the generated markdown reports as artifacts; refine eval-gate PR vs manual gating.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
scripts/health-snapshot.js New report generator for JS syntax + Python unit test health snapshots with pruning.
package.json Adds npm scripts to run the health snapshot (normal/quiet).
README.md Documents health snapshot usage and report retention behavior.
.github/workflows/python-agent-tests.yml Adds Node setup + health snapshot + artifact upload; changes workflow triggers/concurrency.
.github/workflows/eval-gate.yml Adds branch scoping + concurrency; splits PR baseline gate from manual strict gate.

Comment on lines +15 to +23
if (process.env.AGENT_PYTHON && String(process.env.AGENT_PYTHON).trim()) {
candidates.push({ cmd: String(process.env.AGENT_PYTHON).trim(), argsPrefix: [] });
}

const winVenvPython = path.resolve(process.cwd(), '.venv', 'Scripts', 'python.exe');
if (fs.existsSync(winVenvPython)) {
candidates.push({ cmd: winVenvPython, argsPrefix: [] });
}

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pythonCommandCandidates() only checks a Windows venv path and then falls back to python/py -3. On Linux/macOS it’s common to have .venv/bin/python and/or only python3 available, so health:snapshot can fail locally even when a venv exists. Add POSIX venv + python3 candidates (and keep the existing Windows/AGENT_PYTHON logic).

Suggested change
if (process.env.AGENT_PYTHON && String(process.env.AGENT_PYTHON).trim()) {
candidates.push({ cmd: String(process.env.AGENT_PYTHON).trim(), argsPrefix: [] });
}
const winVenvPython = path.resolve(process.cwd(), '.venv', 'Scripts', 'python.exe');
if (fs.existsSync(winVenvPython)) {
candidates.push({ cmd: winVenvPython, argsPrefix: [] });
}
const cwd = process.cwd();
if (process.env.AGENT_PYTHON && String(process.env.AGENT_PYTHON).trim()) {
candidates.push({ cmd: String(process.env.AGENT_PYTHON).trim(), argsPrefix: [] });
}
if (process.platform === 'win32') {
const winVenvPython = path.resolve(cwd, '.venv', 'Scripts', 'python.exe');
if (fs.existsSync(winVenvPython)) {
candidates.push({ cmd: winVenvPython, argsPrefix: [] });
}
} else {
const posixVenvPython = path.resolve(cwd, '.venv', 'bin', 'python');
if (fs.existsSync(posixVenvPython)) {
candidates.push({ cmd: posixVenvPython, argsPrefix: [] });
}
const posixVenvPython3 = path.resolve(cwd, '.venv', 'bin', 'python3');
if (fs.existsSync(posixVenvPython3)) {
candidates.push({ cmd: posixVenvPython3, argsPrefix: [] });
}
}
candidates.push({ cmd: 'python3', argsPrefix: [] });

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +77
child.on('close', (code) => {
resolve({
ok: code === 0,
code: code || 0,
stdout,
stderr,
startedAt,
finishedAt: new Date(),
command: [command, ...args].join(' ')

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the close handler, code can be null when the process exits due to a signal. Using code: code || 0 will record an exit code of 0 even though ok is false, which makes the report misleading. Capture the (code, signal) parameters and set a non-zero code when code is null (and ideally include the signal in the recorded error output/metadata).

Suggested change
child.on('close', (code) => {
resolve({
ok: code === 0,
code: code || 0,
stdout,
stderr,
startedAt,
finishedAt: new Date(),
command: [command, ...args].join(' ')
child.on('close', (code, signal) => {
const finishedAt = new Date();
const exitCode = typeof code === 'number' ? code : 1;
let finalStderr = stderr;
if (signal) {
const signalMessage = `Process terminated by signal ${signal}`;
finalStderr = finalStderr ? `${finalStderr}\n${signalMessage}` : signalMessage;
}
resolve({
ok: exitCode === 0,
code: exitCode,
stdout,
stderr: finalStderr,
startedAt,
finishedAt,
command: [command, ...args].join(' '),
signal: signal || null

Copilot uses AI. Check for mistakes.
Comment on lines +172 to +187
for (const file of files) {
const result = await runCommand(process.execPath, ['--check', file], { streamOutput: !IS_QUIET });
outputs.push(`$ node --check ${file}`);
if (result.stdout) outputs.push(result.stdout.trim());
if (result.stderr) outputs.push(result.stderr.trim());

if (!result.ok) {
return {
ok: false,
code: result.code,
stdout: outputs.filter(Boolean).join('\n\n'),
stderr: result.stderr || '',
startedAt,
finishedAt: new Date(),
command: 'node --check <repo-js-files>'
};

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runJsCheck() appends result.stderr into stdout (via outputs) and also returns it separately as stderr. Since formatSection() concatenates stdout and stderr, JS check failures will duplicate the same stderr content in the report. Return either a single combined output, or keep stdout/stderr separate (but don’t include stderr in both).

Copilot uses AI. Check for mistakes.
Comment on lines 3 to +9
on:
push:
paths:
- "agent/**"
- ".github/workflows/python-agent-tests.yml"
branches:
- main
pull_request:
paths:
- "agent/**"
- ".github/workflows/python-agent-tests.yml"
branches:
- main

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This workflow no longer uses paths: filters and will run for every PR targeting main, even when the Python agent isn’t touched. If the intent is to keep this scoped to agent/** (as the README currently states), consider restoring paths: (possibly in addition to branches:) to avoid unnecessary CI time.

Copilot uses AI. Check for mistakes.
Comment thread README.md
Comment on lines +94 to +100
npm run health:snapshot
npm run health:snapshot:quiet
```

`npm run health:snapshot` runs JS syntax checks plus Python agent unit tests and writes a timestamped report to `logs/health/`.
Use `npm run health:snapshot:quiet` for CI-friendly minimal console output while preserving full report details.
Report retention is automatic: by default the latest `20` reports are kept in `logs/health/`. Override with `HEALTH_SNAPSHOT_KEEP_LATEST`.

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README says GitHub Actions runs the Python agent tests automatically for agent/** changes, but python-agent-tests.yml now triggers on all PRs to main (no paths: filter). Update the README statement to match the new workflow behavior, or re-scope the workflow to agent/** changes.

Copilot uses AI. Check for mistakes.
npm run eval:collect
npm run eval:run
npm run eval:history
node -e "const fs=require('fs');const r=JSON.parse(fs.readFileSync('eval/report.json','utf8'));const score=Number((r.summary||{}).overallScore||0);if(score<40){console.error('[EvalBaselineGate] FAIL overallScore',score,'< 40');process.exit(1)};console.log('[EvalBaselineGate] PASS overallScore='+score);"

Copilot AI Mar 5, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR baseline gate uses an inline node -e ... snippet with a hard-coded threshold (< 40) to parse eval/report.json. This duplicates logic already implemented in scripts/eval/ci-gate.js (npm run eval:gate) and makes future changes harder. Consider reusing the existing gate script and setting EVAL_GATE_MIN_SCORE=40 (and any other needed env flags) for the PR baseline case instead of embedding custom parsing here.

Suggested change
node -e "const fs=require('fs');const r=JSON.parse(fs.readFileSync('eval/report.json','utf8'));const score=Number((r.summary||{}).overallScore||0);if(score<40){console.error('[EvalBaselineGate] FAIL overallScore',score,'< 40');process.exit(1)};console.log('[EvalBaselineGate] PASS overallScore='+score);"
npm run eval:gate
env:
EVAL_GATE_MIN_SCORE: '40'

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants