Fix JSON parse failure when LLM returns unescaped control chars in sampling responses#50
Fix JSON parse failure when LLM returns unescaped control chars in sampling responses#50m-nash wants to merge 11 commits into
Conversation
When a user asks for clarification about a CI failure recommendation and the agent arrives at a different conclusion, the updated recommendation was being lost. ProcessTaskComplete re-called ProcessInvestigationComplete which always read the stale SuggestedFix from state. Now: propagate LastRecommendation to SuggestedFix before re-presenting choices, and instruct the agent to include data.recommendation when the analysis changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Include the PR title and description in the ExplainCommentAsync sampling prompt so the LLM can assess whether review suggestions fall within the PR's scope before defaulting to implement. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Mark PR title/body, review comments, and diffs as untrusted data in the sampling system prompt. Wrap untrusted sections with BEGIN/END delimiters to prevent embedded instructions from being followed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Verify PR title, body, truncation, empty body handling, and untrusted content guardrail in ExplainCommentAsync prompts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace three duplicate private Truncate methods and inline truncation with a single extension method. Unifies suffix to use '...' consistently. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…mpling responses The LLM frequently emits literal newline characters (0x0A) inside JSON string values, which System.Text.Json correctly rejects. Add a SanitizeJsonControlChars step in SampleStructuredAsync that escapes bare control characters within JSON strings before parsing. Also add a retry to explain_comment to match the existing compose_reply pattern, as a fallback for any remaining transient parse failures. Fixes repeated 'Sampling failed to explain comment' errors observed during PR #133 monitoring. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes structured sampling failures when the LLM returns invalid JSON containing unescaped control characters inside string values, and improves robustness/consistency of sampling-driven flows.
Changes:
- Add JSON sanitization to escape bare control characters inside JSON string values before
System.Text.Jsondeserialization. - Add a one-time retry for
explain_commentsampling (to match existingcompose_replyretry behavior). - Extend prompt/context used for
explain_comment(include PR title/body) and propagate updated recommendations in CI failure flow; add/expand unit tests for these behaviors.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| PrCopilot/src/PrCopilot/Tools/SamplingHelper.cs | Adds SanitizeJsonControlChars and uses it in structured sampling; expands explain-comment prompt and includes PR context (title/body). |
| PrCopilot/src/PrCopilot/Tools/MonitorFlowTools.cs | Retries explain_comment once on null/parse failure; updates truncation usage; records PR body into monitor state. |
| PrCopilot/src/PrCopilot/StringExtensions.cs | Introduces shared Truncate() extension used across multiple components for consistent logging/prompt building. |
| PrCopilot/src/PrCopilot/StateMachine/PrStatusFetcher.cs | Fetches PR body via gh api and populates PrInfo.Body. |
| PrCopilot/src/PrCopilot/StateMachine/PrInfo.cs | Adds Body field to PR info model. |
| PrCopilot/src/PrCopilot/StateMachine/MonitorState.cs | Adds PrBody to persisted monitor state for downstream prompt/context usage. |
| PrCopilot/src/PrCopilot/StateMachine/MonitorTransitions.cs | Propagates updated LastRecommendation into SuggestedFix in CI failure flow; switches truncation helper to extension. |
| PrCopilot/src/PrCopilot/StateMachine/GitHubCliExecutor.cs | Switches local truncation helper to shared Truncate() extension. |
| PrCopilot/tests/PrCopilot.Tests/SamplingHelperTests.cs | Adds tests for PR context injection, sanitization behavior, and end-to-end structured sampling parse with literal newlines. |
| PrCopilot/tests/PrCopilot.Tests/StateMachineTests.cs | Adds CI failure flow tests ensuring updated recommendations propagate correctly and affect available choices. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Scan for any char < 0x20 before allocating StringBuilder and doing the full walk. Most JSON responses have no control chars, so this avoids unnecessary allocation in the common case. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| state.PendingReplyText = reply.ReplyText; | ||
| state.SamplingCompletionEvent = completionEvent; | ||
| DebugLogger.Log("Sampling", $"Reply composed: {Truncate(reply.ReplyText, 100)}"); | ||
| DebugLogger.Log("Sampling", $"Reply composed: {reply.ReplyText.Truncate(100)}"); | ||
| return true; |
There was a problem hiding this comment.
reply.ReplyText.Truncate(100) relies on the PrCopilot.StringExtensions extension method, but this file is in namespace PrCopilot.Tools; and doesn’t import using PrCopilot; (and there are no global usings). This will not compile unless the extension namespace is in scope.
|
|
||
| private static string Truncate(string text, int maxLength) => | ||
| text.Length <= maxLength ? text : text[..maxLength] + "..."; | ||
| text.Truncate(maxLength); |
There was a problem hiding this comment.
text.Truncate(maxLength) relies on the PrCopilot.StringExtensions extension method, but this file is in namespace PrCopilot.StateMachine; and doesn’t import using PrCopilot; (and there are no global usings). This will not compile unless the extension namespace is in scope.
| text.Truncate(maxLength); | |
| text.Length <= maxLength ? text : text[..maxLength]; |
| return text ?? ""; | ||
| return text[..maxLength] + "..."; | ||
| } | ||
| } |
There was a problem hiding this comment.
StringExtensions.Truncate is declared in the PrCopilot namespace, but it’s used as an extension method from PrCopilot.Tools and PrCopilot.StateMachine without a using PrCopilot; (and there are no global usings in this project). As-is, calls like text.Truncate(...) won’t compile. Consider either adding using PrCopilot; where it’s used, or introducing a GlobalUsings.cs with global using PrCopilot; so the extension is in scope project-wide.
| } | |
| } | |
| namespace PrCopilot.Tools; | |
| internal static class StringExtensions | |
| { | |
| /// <summary> | |
| /// Truncates a string to the specified maximum length, appending "..." if truncated. | |
| /// Returns empty string for null input. | |
| /// </summary> | |
| internal static string Truncate(this string? text, int maxLength) | |
| { | |
| if (string.IsNullOrEmpty(text) || text.Length <= maxLength) | |
| return text ?? ""; | |
| return text[..maxLength] + "..."; | |
| } | |
| } | |
| namespace PrCopilot.StateMachine; | |
| internal static class StringExtensions | |
| { | |
| /// <summary> | |
| /// Truncates a string to the specified maximum length, appending "..." if truncated. | |
| /// Returns empty string for null input. | |
| /// </summary> | |
| internal static string Truncate(this string? text, int maxLength) | |
| { | |
| if (string.IsNullOrEmpty(text) || text.Length <= maxLength) | |
| return text ?? ""; | |
| return text[..maxLength] + "..."; | |
| } | |
| } |
| // Quick scan: if no control chars exist, return as-is (common fast path). | ||
| // Structural whitespace (\n, \r, \t) outside strings is valid JSON, | ||
| // but inside strings they must be escaped — so only inputs containing | ||
| // control chars need the full walk. | ||
| for (int i = 0; i < json.Length; i++) | ||
| { | ||
| if (json[i] < 0x20) | ||
| goto NeedsSanitization; | ||
| } | ||
|
|
||
| return json; | ||
|
|
||
| NeedsSanitization: | ||
| var sb = new StringBuilder(json.Length); |
There was a problem hiding this comment.
The current fast-path in SanitizeJsonControlChars triggers on any control char anywhere in the JSON, including structural newlines/tabs outside strings (which are valid JSON). That means pretty-printed JSON (very common for LLM output) will always allocate a StringBuilder and copy the entire payload even when there are no invalid characters inside string values. Consider delaying allocation until you actually encounter a control character while inString == true (e.g., build only when needed, otherwise return the original input).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem
During PR #133 monitoring (Azure/azure-cdk), the \�xplain_comment\ sampling step repeatedly failed with:
\
Internal error: Sampling failed to explain comment — LLM returned response that could not be parsed
\\
The debug log revealed the root cause: the LLM returns JSON with literal newline characters (0x0A) inside string values (the
ecommendation\ field), which \System.Text.Json\ correctly rejects per RFC 8259.
Fix
1. Root cause fix: \SanitizeJsonControlChars\ in \SampleStructuredAsync\
Added a new \SanitizeJsonControlChars()\ method that walks JSON text, tracks string boundaries (respecting escape sequences), and replaces bare control characters (U+0000–U+001F) with their proper JSON escape sequences (\\n, \\t, \\r, etc.).
This runs in \SampleStructuredAsync\ after \StripCodeFences, so it fixes the issue for all structured sampling calls, not just \�xplain_comment.
2. Added retry to \�xplain_comment\
The \compose_reply\ path already retried once on parse failure. The \�xplain_comment\ path did not — it threw immediately. Now both paths have consistent retry behavior.
Testing