Hyoka bug fixes by LarryOsterman · Pull Request #640 · ronniegeraghty/hyoka

LarryOsterman · 2026-05-01T22:46:26Z

No description provided.

Manually port the logical changes from Larry's branch without dragging the gofmt indentation churn that made up ~95% of his diff: 1. workspace/delta.go: TakeSnapshot now skips build-artifact directories (target/, node_modules/, bin/, dist/, etc.) via utils.IsDefaultExcludedDir. Generated files were overwhelming reviewer prompts. 2. review/{types,reviewer,buckets}.go: ReviewResult gains a SkippedReviewers []SkippedReviewer field, populated by both ReviewPanel and ReviewPanelBuckets when a reviewer model fails. Users can now see which reviewers were skipped and why. 3. eval/copilot.go: Fixed two sites still using e.maxSessionActions instead of the per-eval maxSessionActionsLimit (already in scope from line 262, already used correctly at line 361). Also added counting for assistant.reasoning events — per Ronnie's directive, every action the agent takes (reasoning, tool calls, responses) counts toward the action limit, not just tool invocations. Tests added (Switch): - TestTakeSnapshot_BuildArtifactDirsExcluded - TestReviewPanel_SkippedReviewers (+ JSON marshaling test) - TestReviewPanelBuckets_SkippedReviewers - Per-eval limit + assistant.reasoning counter tests in eval package Refs: #640 Co-authored-by: Larry Osterman <larryo@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Merged .squad/decisions/inbox/ (neo-pr640-port.md, switch-pr640-tests.md) into decisions.md - Deleted inbox files after merge - Added cross-agent notes to Oracle, Morpheus, Trinity, Tank history files: * Oracle: action semantics clarification (assistant.reasoning counts as action) * Morpheus: architectural implications of action counting * Trinity: SkippedReviewers field now surfaced in ReviewResult * Tank: build artifacts excluded from workspace snapshots - decisions.md now 73KB (no archival needed; no entries >30 days old) Commit: 703f638 (PR #640 port) — all 3 bugs fixed, 7 tests passing, all systems green.

ronniegeraghty · 2026-05-05T19:01:04Z

Hey @LarryOsterman — thanks for the fixes! I've ported the three logical changes from this branch to ronniegeraghty/dev as commit 703f638:

Build-artifact dir skip in workspace/delta.go (utils.IsDefaultExcludedDir check inside the walk callback) — plus your TestTakeSnapshot_BuildArtifactDirsExcluded.
SkippedReviewers field on ReviewResult, populated by both ReviewPanel and ReviewPanelBuckets.
Action counter now uses the per-eval maxSessionActionsLimit at lines 600 and 622 (previously stuck on e.maxSessionActions).

One addition on top of yours: I also added assistant.reasoning as a counted event type in the action switch. The semantic we're going with is every agent action — reasoning, tool calls, bash, responses — counts against the same limit, not just tool invocations. Switch wrote tests covering per-eval limits, the new event type, and mixed-event counting.

I skipped the engine_eval.go portion of your diff since it was indentation-only churn (your editor reformatted on save). Cherry-picking the whole commit would have dragged ~170 lines of whitespace conflicts; the manual port keeps the diff minimal and surgical.

All tests green on the touched packages. Closing this PR isn't required from your end — just wanted to acknowledge the carry-over and keep you in the loop. Appreciate the bug hunting!

…matching, and inline graders - PR #640 Fixes: Generated files exclusion, skipped reviewers reporting, action counter for reasoning+tool_execution_start - Batch 2 (Per-property when negation): MatchSet polymorphic type, tags: matching with case-insensitive semantics - Batch 3 (Inline graders): graders: field on prompt files, breaking change removal of evaluation_criteria: from YAML Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Document that actions include assistant.reasoning events, tool.execution_start events, tool calls, bash commands, and responses. Update description to match PR #640 fix. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- CHANGELOG: Entries for PR #640, MatchSet/tags:, inline graders - docs/prompt-authoring.md: graders:, tags:, inline graders guide - docs/graders/index.md: MatchSet polymorphic type, negation, tags: semantics - docs/graders/activity.md: Action counter semantics (reasoning + tools) - docs/architecture.md: Grader execution order, action counter - examples/prompts/inline-graders-example.prompt.md: Markdown format example Coverage: 100% complete. 0 gaps. 6 commits in dev. Build green. Authored-by: Oracle Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Hyoka bug fixes

b0134e3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hyoka bug fixes#640

Hyoka bug fixes#640
LarryOsterman wants to merge 1 commit into
ronniegeraghty/devfrom
larryo/for_ronnie

LarryOsterman commented May 1, 2026

Uh oh!

ronniegeraghty commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

LarryOsterman commented May 1, 2026

Uh oh!

ronniegeraghty commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants