Skip to content

[TICK 3-001] Fix Trust Gate JSON parsing for markdown-fenced LLM responses#6

Merged
timeleft-- merged 3 commits intomainfrom
builder/tick-5-bug-trust-gate-json-parsing-fa
Feb 25, 2026
Merged

[TICK 3-001] Fix Trust Gate JSON parsing for markdown-fenced LLM responses#6
timeleft-- merged 3 commits intomainfrom
builder/tick-5-bug-trust-gate-json-parsing-fa

Conversation

@timeleft--
Copy link
Copy Markdown
Member

Summary

  • Adds _extract_json_from_llm_response() to trust_gate.py that strips markdown code fences (```json / ``` ``), leading/trailing whitespace, and extracts the first JSON object from preamble-wrapped responses before callingjson.loads()`
  • Wires the sanitizer into _parse_verdict() (replacing bare json.loads(content) at line 223)
  • Adds 8 new test cases covering all sanitization paths, including an end-to-end test through review_thought()

Root Cause

trust_gate.py:223 called json.loads(content) directly on the raw LLM response. Gemini 2.5 Flash occasionally wraps its structured JSON output in markdown fences even when response_format: json_object is set, causing a JSONDecodeError at char 1. After the retry, the thought ended up with validation_status: "error" instead of the reviewer's actual verdict.

Test plan

  • test_extract_json_fenced_with_lang_tag```json ... ```
  • test_extract_json_fenced_no_lang_tag``` ... ```
  • test_extract_json_leading_trailing_whitespace
  • test_extract_json_with_preamble_text — "Here is my response: {...}"
  • test_extract_json_clean_no_fences — clean JSON passes unchanged
  • test_extract_json_genuinely_invalid — no JSON → json.loads() error preserved (fail-closed intact)
  • test_extract_json_nested_braces — nested {} handled (first { to last })
  • test_review_thought_fenced_json_response — end-to-end: fenced response → correct verdict
  • All 176 existing tests pass (no regressions)

Fixes: Issue #5

🤖 Generated with Claude Code

timeleft-- and others added 3 commits February 25, 2026 10:05
Trust gate _parse_verdict() fails when Gemini Flash wraps JSON in
markdown code fences. Amend spec 3 and plan 3 with TICK-001 adding
_extract_json_from_llm_response() utility and test coverage.

Fixes #5

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _extract_json_from_llm_response() sanitization utility to trust_gate.py
that strips markdown code fences (```json or ```), leading/trailing whitespace,
and extracts the first JSON object from preamble-wrapped responses before
passing to json.loads(). Wire into _parse_verdict() to fix the bug where
Gemini 2.5 Flash fence-wrapped responses caused validation_status: "error"
instead of properly applying the reviewer's verdict.

Add 8 new tests covering all sanitization paths. All 176 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace npm run build / npm test (codev monorepo checks) with
uv run pytest commands appropriate for this Python project.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@timeleft-- timeleft-- merged commit 9bb3d18 into main Feb 25, 2026
0 of 2 checks passed
timeleft-- added a commit that referenced this pull request Feb 27, 2026
…ate-json-parsing-fa

[TICK 3-001] Fix Trust Gate JSON parsing for markdown-fenced LLM responses
@timeleft-- timeleft-- deleted the builder/tick-5-bug-trust-gate-json-parsing-fa branch March 4, 2026 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant