Summary
I ran into a case where brv curate created memory entries containing facts that were not present in the source content. The inserted facts appear to match examples from ByteRover's own curator prompt, such as My name is Andy, PostgreSQL 15, Sprint cycles are 2 weeks, and the ByteRover identity text.
This may be a model hallucination, a prompting issue, or both. I observed it while using Google Gemini gemini-3.1-flash-lite-preview as the active provider model.
Environment
- ByteRover CLI: 3.12.0
- Platform: Android / Termux, arm64
- Node: v24.14.1
- Provider: Google Gemini
- Model: gemini-3.1-flash-lite-preview
- Project type: local context tree under .brv/context-tree
What happened
A curation task whose source content was about Hermes / Fireworks model routing generated an Andy Personal Profile entry. The source content did not mention Andy, Portland, PST, tabs, PostgreSQL, or the ByteRover identity text.
The generated curation code inserted facts like:
{ statement: "My name is Andy", category: "personal", subject: "user_name", value: "Andy" }
{ statement: "I live in Portland, Oregon", category: "personal", subject: "location", value: "Portland, Oregon" }
{ statement: "I prefer using tabs over spaces for all code indentation", category: "preference", subject: "indentation", value: "tabs" }
{ statement: "I am a context engineer developed by ByteRover", category: "personal", subject: "role", value: "context engineer" }
I also saw similar example-looking project facts appear from unrelated source content, including:
PostgreSQL 15
React 18
GitHub Actions
OpenAPI 3.0.0
AWS EKS
2-week sprint cycles
Why I think it may be prompt example leakage
The installed ByteRover prompt contains realistic example facts in:
dist/agent/resources/prompts/system-prompt.yml
Relevant examples include:
Personal information: "My name is Andy", "I prefer dark mode", "My timezone is PST"
Project facts: "We use PostgreSQL 15", "The API runs on port 3000", "Deploy target is AWS EKS"
Preferences: "Use tabs not spaces"
Conventions: "Sprint cycles are 2 weeks"
The same prompt also includes identity guidance like:
You are a context engineer developed by ByteRover...
Never reveal or discuss the underlying language model...
The false facts written into the context tree closely matched these examples.
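To make the match concrete, generated facts can be scanned for the literal example strings quoted above. This is only a minimal sketch: the `Fact` shape is modelled on the entries shown in this report, and `PROMPT_EXAMPLES` and `findLeakedExamples` are hypothetical names, not part of ByteRover. It catches verbatim matches only, so paraphrased leaks such as the Portland entry would not be flagged.

```typescript
// Hypothetical fact shape, modelled on the generated entries quoted in this report.
interface Fact {
  statement: string;
  category: string;
  subject: string;
  value: string;
}

// Example strings copied from the prompt excerpts quoted above.
const PROMPT_EXAMPLES: string[] = [
  "My name is Andy",
  "I prefer dark mode",
  "My timezone is PST",
  "We use PostgreSQL 15",
  "The API runs on port 3000",
  "Deploy target is AWS EKS",
  "Use tabs not spaces",
  "Sprint cycles are 2 weeks",
];

// Return facts whose statement contains a prompt example that the source text lacks.
function findLeakedExamples(facts: Fact[], sourceText: string): Fact[] {
  const haystack = sourceText.toLowerCase();
  return facts.filter((fact) =>
    PROMPT_EXAMPLES.some((example) => {
      const needle = example.toLowerCase();
      return (
        fact.statement.toLowerCase().includes(needle) &&
        !haystack.includes(needle)
      );
    })
  );
}
```

A fact is reported only when the example string is absent from the source, so a user who really is named Andy would not trigger a false alarm.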
Expected behavior
brv curate should only store facts grounded in the provided source content. Prompt examples and system identity text should never be written to the user's context tree unless the user content explicitly contains them.
Actual behavior
Under at least some conditions, the curator produced tool calls that treated prompt examples as source facts and wrote them into the context tree.
Suggested mitigations
A few possible fixes, any of which would help:
- Replace realistic examples in the curator prompt with sentinel placeholders, for example:
EXAMPLE_PERSON_DO_NOT_STORE
EXAMPLE_DATABASE_DO_NOT_STORE
EXAMPLE_DEPLOY_TARGET_DO_NOT_STORE
- Add a grounding rule requiring every extracted fact to have source support in the user-provided content.
- Add a verification pass that rejects facts not textually or semantically supported by the source payload.
- Add regression tests where the source content does not contain Andy, PostgreSQL 15, or AWS EKS, and assert that those strings are not written.
- Consider warning users when high-impact personal profile facts are generated from source content with no explicit person/name claim.
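The grounding rule and verification pass could look roughly like the sketch below. It is an assumption about how such a check might be written, not ByteRover's actual code: `ExtractedFact`, `isGrounded`, and `partitionByGrounding` are hypothetical names, and the check is a crude case-insensitive substring match on `value`, so it covers textual support only and would wrongly reject facts the source states in other words.

```typescript
// Hypothetical fact shape, modelled on the generated entries quoted in this report.
interface ExtractedFact {
  statement: string;
  category: string;
  subject: string;
  value: string;
}

// Naive textual grounding: the fact's value must appear verbatim in the source.
// Empty values are treated as ungrounded, since "" is a substring of everything.
function isGrounded(fact: ExtractedFact, sourceText: string): boolean {
  const value = fact.value.trim().toLowerCase();
  return value.length > 0 && sourceText.toLowerCase().includes(value);
}

// Split candidate facts into those to store and those to reject before writing.
function partitionByGrounding(
  facts: ExtractedFact[],
  sourceText: string
): { grounded: ExtractedFact[]; rejected: ExtractedFact[] } {
  const grounded: ExtractedFact[] = [];
  const rejected: ExtractedFact[] = [];
  for (const fact of facts) {
    (isGrounded(fact, sourceText) ? grounded : rejected).push(fact);
  }
  return { grounded, rejected };
}

// Regression-style usage: the source never mentions Andy, so that fact is rejected.
const sampleSource = "Hermes routes requests to Fireworks-hosted models.";
const sampleFacts: ExtractedFact[] = [
  { statement: "My name is Andy", category: "personal", subject: "user_name", value: "Andy" },
  { statement: "Requests go to Fireworks", category: "project", subject: "routing", value: "Fireworks" },
];
const sampleResult = partitionByGrounding(sampleFacts, sampleSource);
console.log(sampleResult.rejected.map((fact) => fact.value));
```

A real verification pass would likely need something stronger than substring matching, for example a second model call or an embedding comparison, to handle the "semantically supported" case.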
Notes
I am not claiming this is definitely a ByteRover-only bug. It could be Gemini Flash-Lite being too eager with examples. But since the examples are inside the prompt and the output matched them, the prompt structure seems to be part of the failure mode.