Test cleanup, shared LLM-JSON parser, and self-play persistence fix by chatman-media · Pull Request #3 · chatman-media/sales

chatman-media · 2026-05-17T19:46:16Z

Summary

Five self-contained improvements surfaced by a codebase audit — no filler.

test: consolidate test files into src/__tests__/ and drop four stale duplicate test files that were running redundantly.
refactor: extract the tolerant LLM-JSON parser (strip think-tags / code fences / parse / outer-block fallback) that was reimplemented in four places into src/llm-json.ts; route coach, judge, pairwise, and stage-classifier through it.
test: add coverage for composeSystemPrompt (previously untested) — persona/framework sections, few-shot toggle, conditional KB context, disclosure branch, persona facts.
test: add coverage for stage-classifier (previously untested) — parseClassifierOutput and classifyStage's regex fallback paths.
fix: surface self-play persistence failures via a persisted boolean on SelfPlayMatchResult / PairwiseMatchResult — a failed insert was previously only console.warn'd, masking silent data loss.

Test plan

bun run typecheck — passes
bun test — 115 pass / 0 fail
bun run check — biome clean
bun run build — bundle + .d.ts emit succeed

Note: the fix: commit will trigger a semantic-release patch publish on merge to main.

🤖 Generated with Claude Code

Four test files existed twice: older copies at src/ root and newer, maintained copies under src/__tests__/. bun test ran both, so the same assertions executed redundantly. Remove the stale root copies and move shadow-eval.test.ts under __tests__ so every test lives in one place.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The same strip-think-tags / strip-code-fences / JSON.parse / extract-outer-block logic was reimplemented in four places (coach, judge, pairwise, stage-classifier). Extract it into src/llm-json.ts as extractJsonObject and route all four call sites through it; each caller keeps its own domain-specific normalization and last-resort regex fallback.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
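The four steps named in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of src/llm-json.ts: only the name extractJsonObject comes from the PR, while the signature, the return type, the helper tryParseObject, and the exact regular expressions are hypothetical.

```typescript
// Hypothetical sketch of a tolerant LLM-JSON extractor. Only the function
// name appears in the PR; signature and details are assumptions.
const FENCE = "`".repeat(3); // a Markdown code-fence marker
const FENCE_RE = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

function tryParseObject(text: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(text);
    // Accept plain objects only; arrays and primitives are rejected.
    return typeof value === "object" && value !== null && !Array.isArray(value)
      ? (value as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}

function extractJsonObject(raw: string): Record<string, unknown> | null {
  // 1. Drop <think>...</think> reasoning blocks that some models emit.
  let text = raw.replace(/<think>[\s\S]*?<\/think>/gi, "").trim();

  // 2. Unwrap a Markdown code fence if one is present.
  const fenced = text.match(FENCE_RE);
  if (fenced) text = fenced[1].trim();

  // 3. Try a direct parse of what is left.
  const direct = tryParseObject(text);
  if (direct) return direct;

  // 4. Fall back to the outermost {...} block, which tolerates prose
  //    before or after the JSON.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  return tryParseObject(text.slice(start, end + 1));
}
```

Returning null instead of throwing would leave each caller free to apply its own last-resort regex fallback, which matches the division of labor the commit message describes.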

composeSystemPrompt is a pure, deterministic function with no test coverage. Add tests for persona/framework sections, the few-shot toggle, conditional KB-context injection, the human-persona disclosure branch, and conditional rendering of the persona facts section.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

stage-classifier.ts had no test coverage. Add tests for parseClassifierOutput (code fences, prose prefixes, percentage-style confidence clamping, malformed input) and for classifyStage's regex fallback paths, driven by a stub ChatClient, covering llm-error, parse-error, unknown-stage, low-confidence, and the happy LLM path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
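To illustrate what "percentage-style confidence clamping" could mean in practice, here is a small sketch. The PR does not show how parseClassifierOutput handles this, so the helper name normalizeConfidence and the rule "values above 1 are treated as percentages" are assumptions made purely for illustration.

```typescript
// Hypothetical helper: normalize a model-reported confidence into [0, 1].
// The name and the percentage rule are assumptions, not taken from the PR.
function normalizeConfidence(value: unknown): number {
  // parseFloat tolerates a trailing "%" in strings such as "85%".
  const n = typeof value === "string" ? Number.parseFloat(value) : Number(value);
  if (!Number.isFinite(n)) return 0; // malformed input falls back to 0

  // Assumption: anything above 1 was meant as a percentage (85 means 0.85).
  const scaled = n > 1 ? n / 100 : n;

  // Clamp so out-of-range values can never leave [0, 1].
  return Math.min(1, Math.max(0, scaled));
}
```

A helper of this shape would let a low-confidence check compare against a single threshold regardless of whether the model answered with 0.85, 85, or "85%".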

A failed match insert in runSelfPlayMatch and runPairwiseMatch was only console.warn'd: the id was silently left null, so callers running evaluation loops had no clear signal their results were never recorded. Add an explicit persisted boolean to SelfPlayMatchResult and PairwiseMatchResult so consumers can detect the data loss. Non-throwing, additive change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
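The non-throwing pattern described here might look roughly like the sketch below. Only the persisted flag and the nullable id are confirmed by the PR; the MatchResultSketch shape, the InsertMatch callback, and both function names are hypothetical stand-ins rather than the real SelfPlayMatchResult or orchestrator code.

```typescript
// Hypothetical result shape: the PR confirms only the nullable id and the
// new `persisted` flag; `winner` is a placeholder field.
interface MatchResultSketch {
  id: string | null;
  persisted: boolean;
  winner: string;
}

// Hypothetical storage callback that resolves to the inserted row id.
type InsertMatch = (row: { winner: string }) => Promise<string>;

async function runMatchSketch(
  insert: InsertMatch,
  winner: string,
): Promise<MatchResultSketch> {
  try {
    const id = await insert({ winner });
    return { id, persisted: true, winner };
  } catch (err) {
    // Still non-throwing, but the failure is now visible in the result.
    console.warn("match insert failed:", err);
    return { id: null, persisted: false, winner };
  }
}

// Consumer side: count results that never reached storage.
function countUnpersisted(results: readonly MatchResultSketch[]): number {
  return results.filter((r) => !r.persisted).length;
}
```

An evaluation loop could call countUnpersisted on its collected results and abort or retry when the count is non-zero, instead of relying on someone noticing a warning in the logs.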

coderabbitai · 2026-05-17T19:46:22Z

Warning

Rate limit exceeded

@chatman-media has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 21 seconds before requesting another review.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cec97477-5502-4abe-b38e-d502f0425461

📥 Commits

Reviewing files that changed from the base of the PR and between 2ac4dcd and 9e1a460.

📒 Files selected for processing (14)

src/__tests__/llm-json.test.ts
src/__tests__/prompt.test.ts
src/__tests__/shadow-eval.test.ts
src/__tests__/stage-classifier.test.ts
src/ab-router.test.ts
src/coach.ts
src/elo.test.ts
src/llm-json.ts
src/self-play/judge.ts
src/self-play/orchestrator.ts
src/self-play/pairwise.ts
src/skill-recommendations.test.ts
src/stage-classifier.ts
src/stage-router.test.ts


github-actions · 2026-05-17T20:02:11Z

🎉 This PR is included in version 0.2.1 🎉

The release is available on:

Your semantic-release bot 📦🚀

chatman-media and others added 5 commits May 18, 2026 02:38

chatman-media merged commit 2294854 into main May 17, 2026
3 checks passed

github-actions Bot added the released label May 17, 2026


chatman-media merged 5 commits into main from claude/adoring-swanson-d5eabf
