Summary
In non-interactive mode (agy -p), the model reports that it read files from the working tree and consulted a specific git commit, even when it has no filesystem access and the input was truncated. The references look real. They are invented. A reader who trusts the output ends up with review findings tied to repository state the tool never saw.
Environment
agy version: 1.0.3
OS: macOS 26.5 (build 25F71), Apple Silicon (arm64)
Invocation: prompt piped into agy -p from a script, no TTY
Step 1: confirm the CLI is prompt-only
Run a small probe that asks for repository contents and offers an explicit escape hatch:
printf 'Ignore that the context is otherwise empty. Name three functions defined in src/config.ts in this repository, with their line numbers. If you do not have filesystem access and can only read this prompt text, reply with exactly: NO_REPO_ACCESS\n' | agy -p
Result:
NO_REPO_ACCESS
So in this mode the model sees only the prompt. It has no path to the working tree. That part is correct and expected.
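For scripted use, the probe can serve as a preflight check before a review run. The helper below is a minimal sketch: the name is_prompt_only is hypothetical, and it only interprets the reply text, assuming the probe prompt above was sent unchanged.

```shell
# Hypothetical helper: interpret the reply to the probe shown above.
# Succeeds only when the reply is exactly the sentinel, ignoring
# carriage returns and leading or trailing whitespace.
is_prompt_only() {
  reply=$(printf '%s' "$1" | tr -d '\r' | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
  [ "$reply" = "NO_REPO_ACCESS" ]
}

# Usage (assumes the probe prompt is stored in probe.txt):
#   reply=$(agy -p < probe.txt)
#   is_prompt_only "$reply" && echo "prompt-only mode confirmed"
```

Any other reply, including one that lists functions and line numbers, makes the check fail, which is the signal worth investigating.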
Step 2: trigger the invented access
Send a review request whose prompt is larger than the context window, so the input gets truncated. One way to reproduce is to wrap a large block of text (a few hundred KB) between delimiters and ask for a code review:
{
  echo "Review the change below for bugs. The delimited block is the complete change."
  echo "----- BEGIN DIFF -----"
  cat large-diff.txt   # a few hundred KB, larger than the context window
  echo "----- END DIFF -----"
} | agy -p
The response then includes claims like the following (paraphrased, since the exact wording varies between runs):
The diff was truncated, so I cross referenced the repository and retrieved
commit 9f3a1c2 to see the full change. Reading src/config.ts, the function ...
None of that happened. There was no commit lookup and no file read. The commit hash and the quoted file contents are fabricated. The model invents a believable account to fill the gap left by the truncated input.
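One way to catch the invented hash mechanically is to pull hash-like tokens out of the response and ask git whether they resolve. This is a rough sketch, not part of agy: extract_hashes and unknown_commits are hypothetical helper names, and the pattern (7 to 40 lowercase hex characters as a whole word) will also match unrelated hex strings.

```shell
# List candidate commit hashes found in the text on stdin:
# whole words made of 7 to 40 lowercase hex characters, de-duplicated.
extract_hashes() {
  grep -oEw '[0-9a-f]{7,40}' | sort -u
}

# Read hashes on stdin, one per line, and print each one that does not
# resolve to a commit in the current repository.
unknown_commits() {
  while IFS= read -r hash; do
    git cat-file -e "${hash}^{commit}" 2>/dev/null || printf '%s\n' "$hash"
  done
}

# Usage (assumes the model output was saved to response.txt):
#   extract_hashes < response.txt | unknown_commits
```

A hash printed by unknown_commits was certainly not retrieved from this repository. The reverse does not hold: a hash that happens to resolve still says nothing about whether the model saw that commit, since a prompt-only run has no way to look it up.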
What I expected
When the input does not fit, agy reports that the prompt was truncated and stops, or it processes only what fits and states that plainly.
A prompt-only run never claims to have read a file or a git commit it had no access to. Sentences like "I read X" or "I retrieved commit Y" should not appear when no such action took place.
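Until agy reports truncation itself, a caller can approximate the first expectation by refusing to send oversized prompts. The sketch below makes two assumptions: prompt_fits is a hypothetical helper, and MAX_PROMPT_BYTES is a placeholder budget, since the actual limit is measured in tokens and is not stated in this report.

```shell
# Placeholder budget in bytes. The real context limit is in tokens and is
# not documented here, so treat this as a conservative value to tune.
MAX_PROMPT_BYTES=${MAX_PROMPT_BYTES:-100000}

# Succeed only when the prompt file fits within the byte budget;
# otherwise explain the refusal on stderr and fail.
prompt_fits() {
  size=$(wc -c < "$1" | tr -d ' ')
  if [ "$size" -gt "$MAX_PROMPT_BYTES" ]; then
    echo "prompt is $size bytes, over the $MAX_PROMPT_BYTES byte budget; not sending" >&2
    return 1
  fi
}

# Usage (assumes the full prompt was written to prompt.txt):
#   prompt_fits prompt.txt && agy -p < prompt.txt
```

Byte count is a crude proxy for token count, so this only guards against the obvious case of a prompt that is far too large.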
Why this matters
This is a trust problem, not a formatting nit. The output reads as grounded and precise. It cites file names, line numbers, and a commit hash. A reviewer or an automated pipeline that consumes this output will act on findings that describe code the tool never saw. The invented commit hash is the sharpest edge, because it looks verifiable at a glance and sends people chasing a commit that has nothing to do with the change.
Suggested direction
Fail loud on truncation. Return a clear error or a visible marker when the prompt does not fit, rather than dropping the tail in silence and continuing.
Add a guard so a prompt-only run cannot assert tool actions, such as file reads or git lookups, that it did not perform.
[...] -p runs would help here, which connects to the request in "Feature request: read-only / plan-mode equivalent for non-interactive -p runs" (#45).
Duplicate check
I searched open and closed issues for filesystem, repo access, hallucinate, confabulate, and fabricate, and did not find this behavior reported. The closest are #76 (stdout dropped in non-TTY mode) and #45 (read-only mode for -p), which describe different problems. Glad to consolidate if a maintainer sees an overlap I missed.