Skip to content

Proj/byterover tool mode#663

Open
danhdoan wants to merge 180 commits into
mainfrom
proj/byterover-tool-mode
Open

Proj/byterover tool mode#663
danhdoan wants to merge 180 commits into
mainfrom
proj/byterover-tool-mode

Conversation

@danhdoan
Copy link
Copy Markdown
Collaborator

Summary

Lands the full byterover tool-mode project end-to-end:

  • HTML context-tree format. The 19-element <bv-*> vocabulary
    (bv-topic, bv-task, bv-pattern, bv-rule, bv-diagram, …) is
    now the on-disk format for curated topics. Parser, writer, validator
    registry, element-axis index, and LLM-ready renderer all ship in
    src/server/infra/render/. Markdown topics keep working via the
    format detector + extension-aware reader.

  • Tool mode is the default. Both brv curate and brv query run
    without any byterover LLM provider configured. BRV_*_TOOL_MODE
    env-var gates have been removed (ENG-2815). The calling agent owns
    synthesis; byterover retrieves, renders, validates, and writes.

  • MCP migrated to provider-free. brv-query swaps to
    task:create type='query-tool-mode' and returns the structured
    envelope (ENG-2836). brv-curate is replaced by
    brv-curate-html — accepts pre-authored <bv-topic> HTML and routes
    through the writer directly, no LLM (ENG-2837). After this lands,
    MCP integrations (Claude Desktop / Cursor MCP / etc.) inherit the
    no-provider property the CLI already has.

  • New brv read <path> command (ENG-2769) for single-topic render

    • inspection.
  • Editorial topic viewer in webui (ENG-2810) for bv-* HTML
    topics, hardened against untrusted HTML.

  • Telemetry producers (ENG-2741) attach token/latency/format
    metadata to QueryLogEntry and CurateLogEntry via a shared
    TaskUsageAggregator.

What's new

Curate

  • HTML writer with <bv-topic> schema validation and atomic
    tmp+rename writes (ENG-2737/2738/2739).
  • CurateOrchestrator state machine: generate → validate → correct
    → write → retry loop (ENG-2766).
  • Prompt builder generated from ELEMENT_REGISTRY, with
    prompt-injection-hardened delimiters (ENG-2767).
  • CLI surface uses in-process kickoffSession / continueSession
    instead of the legacy daemon runAgentBody path (ENG-2765).
  • System-managed createdat / updatedat injected at write time —
    the model is not allowed to author these.
  • confirmOverwrite guard returns existing content for the caller to
    merge instead of clobbering silently.

Query

  • Tool-mode executor runs Tier 0/1 cache + Tier-2 retrieval without
    the canRespondDirectly confidence threshold (ENG-2811/2812). The
    threshold was a byterover-side judgment; calling agents now own the
    "is this match good enough?" decision.
  • Tier 3/4 LLM-driven synthesis removed from the executor path.
  • HTML-routing in SearchKnowledgeService so HTML topics render
    identically to MD in Tier 2.
  • New docs/curate-protocol.md documents the wire envelope.

MCP

  • brv-query migrated to the tool-mode envelope; renders matched
    topics as markdown sections.
  • brv-curate replaced by brv-curate-html (breaking — see below).
  • Shared encoders: src/shared/transport/query-tool-mode-content.ts
    and src/shared/transport/curate-html-content.ts.

SKILL.md

  • Tool-mode curate session protocol documented (ENG-2768).
  • Query tool-mode section added with skippedSharedCount semantics
    (ENG-2814).
  • Deprecated LLM-provider references scrubbed (ENG-2822).

Breaking changes

  • MCP brv-curate tool input shape changes. The previous
    {context, files, folder} natural-language inputs are gone. The new
    tool (brv-curate-html / curate-html-direct task type) accepts
    {html, confirmOverwrite?} — pre-authored <bv-topic> HTML from
    the calling agent. MCP integrations relying on the natural-language
    curate input must update. See src/server/templates/skill/SKILL.md
    for the authoring guide.

  • BRV_CURATE_TOOL_MODE / BRV_QUERY_TOOL_MODE env vars removed.
    No-op now — tool mode is the only path. Remove from any local
    scripts.

  • useHtmlContextTree flag removed. HTML is the only curate
    output format. Existing markdown topics still read fine via the
    format detector; new writes are always HTML.

Tickets shipped

ENG-2737, ENG-2738, ENG-2739, ENG-2740, ENG-2741, ENG-2765, ENG-2766,
ENG-2767, ENG-2768, ENG-2769, ENG-2810, ENG-2811, ENG-2812, ENG-2814,
ENG-2815, ENG-2822, ENG-2836, ENG-2837.

Rollout

  • Existing .brv/context-tree/*.md topics keep working — read path
    is format-aware.
  • New curate writes land as .html. A separate migration script
    (Phat, parallel work) can re-author legacy MD topics into the
    <bv-topic> HTML format and stash originals under
    .brv/_archive/context-tree-md/. Not included in this PR.
  • Users following the "no provider needed" announcement and running
    brv providers disconnect <all> will now find MCP brv-query and
    brv-curate-html work end-to-end — same property as the CLI.

Test plan

  • npm run typecheck clean
  • npm run lint clean
  • npm test — full unit + integration suite (153 files changed,
    ~15k LOC added; new tests cover HTML reader/writer/elements,
    curate-session, query-retrieval, MCP tool-mode envelopes,
    telemetry aggregator, format detectors, prompt builder)
  • npm run build clean
  • Manual: 0-provider CLI curate + query round trip on a fresh
    .brv/
  • Manual: 0-provider MCP brv-query + brv-curate-html from
    Claude Desktop or Cursor MCP
  • Manual: HTML topic round-trip through webui editorial viewer
  • Verify migration script (Phat's parallel work) lands separately
    and does not block this PR

danhdoan and others added 30 commits May 10, 2026 12:25
…rser

Lands the M1 keystone task: element vocabulary, per-element validators,
data-driven registry, and HTML parser wrapper. No behavior change yet —
no consumer wires the registry until T3 (curate path) and T4 (query path).

Element vocabulary (5 elements):
- bv-topic — root container; required `path`, optional importance/maturity/
  recency/updatedat (lowercase per HTML5 normalization)
- bv-rule — rule statements (severity, id)
- bv-decision — decision records (id)
- bv-bug — bug runbook entries (id, severity)
- bv-fix — fix runbook entries (id)

Architecture:
- Data-driven `ELEMENT_REGISTRY` is the production-track guardrail. M2
  vocabulary expansion is purely additive — new elements register via
  one entry + one schema/validator pair, no consumer code changes.
- `makeAttributeValidator` factory eliminates per-element boilerplate;
  reduces M2's 12 additional element validators to one-line bindings.
- HTML parser is parser-library-agnostic at the API boundary
  (`DocumentNode` / `ElementNode` / `TextNode`). parse5 used internally;
  consumers don't see parse5 types.
- Attribute-case normalization is documented in `ElementNode` JSDoc:
  HTML5 lowercases attribute names at parse time, so schema keys must
  match parser output (lowercase), not source HTML (camelCase).

Tests (106 new):
- 5 per-element validator suites (61 cases total) — each ≥10 cases
- Registry shape + wiring + metadata (14 cases)
- HTML parser: basic parsing, malformed-input handling, walkElements,
  getInnerText, serializeHtml round-trip (22 cases)
- End-to-end round-trip on a sample fixture covering all 5 elements
  (9 cases)

Verification:
- typecheck clean
- lint 0 errors (242 warnings, all pre-existing baseline)
- 7715 tests passing, 0 failing
- build clean

Adds parse5@^8.0.1 as a runtime dep — the W3C-spec HTML parser used by
jsdom; types ship in-tree (no @types/parse5 needed). Justified vs hand-
rolled parsing because robust HTML tag-soup handling is non-trivial and
this is the standard library for the job.
- bv-topic schema: tighten recency regex to ^\d+(\.\d+)?$ (rejects '.', '..1', '1..2' which previously slipped past the regex), drop redundant Number.isFinite for style consistency with importance, accept timezone offsets in updatedat via {offset: true}
- html-parser: replace `as unknown as Parse5DocumentFragment` cast with parse5 defaultTreeAdapter factories (createDocumentFragment, createElement, createTextNode, appendChild) — no more structural casts in the serialize path
- html-parser: getInnerText now inserts a separator between sibling element children so compact emit (<p>foo.</p><p>bar.</p>) does not merge tokens for BM25
- html-parser: document <template> content limitation (parse5 places template children in .content, not childNodes)
- schema docstrings + test names: replace 'warn-only behaviour' with 'parse-and-skip' (no warning is actually emitted in M1)
- sample-topic round-trip: add explicit case-normalization regression test for updatedAt → updatedat
feat: [ENG-2737] HTML render layer foundation — element registry + parser
Rewrite the canonical curate tool description (loaded by ToolDescriptionLoader
and presented to the agent on every curate operation) from the structured
JSON-args shape (rawConcept/narrative/facts/etc.) to the M1 HTML-emission
contract using the closed `<bv-*>` vocabulary defined by T1.

Coverage:
- All 5 elements (bv-topic, bv-rule, bv-decision, bv-bug, bv-fix) with full
  per-element attribute schemas: required/optional, value enums, format
  constraints (importance integer 0-100, recency numeric 0-1, ISO-8601
  updatedat, severity enums per element, etc.).
- Allowed-children semantics (any/inline/block) + pairing convention
  (bug+fix as siblings).
- Path format (slash-separated snake_case) + domain guidelines.
- Output contract: HTML only, exactly one bv-topic root, no preamble/fences/
  commentary, lowercase attribute names per HTML5 spec, closed vocabulary,
  no invented attributes, no clarifying questions.
- Three worked examples (bug+fix runbook, rule+decision pair, general topic)
  to anchor the model's element-pairing and id-naming conventions.

Add a sanity test (test/unit/server/infra/render/curate-prompt.test.ts) that
loads curate.txt and asserts every registered ELEMENT_NAME, every per-element
enum value, and every output-contract directive is present. Guards against
silent drift when M2 expands the vocabulary or refactors attribute schemas.

Authoring fluency check (M1 T2 spike, separate harness in
local-auto-test/curate-fluency/, not in this repo): 20/20 generated outputs
validated against the M1 element registry on Sonnet 4.5. Gate cleared on
first full run after one prompt-tuning iteration. Decision: proceed to T3.
Full report: research repo at features/html-memory-conversion/milestones/01-experiment/fluency-report.md
Field-by-field review of curate.txt against the actual rendered .md
context files (frontmatter + ## Reason + ## Raw Concept + ## Narrative
+ ## Facts) surfaced two issues with the initial T2 draft:

1. <bv-topic> carried runtime-signal attributes (importance, maturity,
   recency, updatedat) that were explicitly migrated to a sidecar store
   under research/features/runtime-signals/ (per-user, per-machine state
   that caused noisy `vc status` diffs and team-share conflicts). Drop
   them from the schema; replace with the actual frontmatter fields the
   markdown writer renders: title (required), summary, tags, keywords,
   related.

2. The 5-element vocabulary covered Rules + three M1 net-new elements
   (Decision/Bug/Fix) but left every other rendered MD section as
   free-form HTML — losing the structural mapping the writer needs.

Expand M1 vocabulary to 16 elements (added 11), each mapping to a
specific section in the rendered .md output:

  - bv-reason       → ## Reason
  - bv-task,        → ## Raw Concept (Task / Changes / Files / Flow)
    bv-changes,
    bv-files,
    bv-flow
  - bv-structure,   → ## Narrative (Structure / Dependencies /
    bv-dependencies,                Highlights / Examples / Diagrams);
    bv-highlights,                  bv-rule already covered Rules
    bv-examples,
    bv-diagram (with type + title attrs for verbatim diagram preservation)
  - bv-fact (subject/category/value attrs) → ## Facts list

Each element follows the existing makeAttributeValidator pattern; the
data-driven registry stays the single source of truth.

Tests:
- Update bv-topic test to drop runtime-signal cases and assert the
  new required-attributes (path + title); add a regression case
  asserting passthrough still tolerates legacy importance/maturity/
  recency without enforcing them.
- Add registry test asserting (a) bv-topic exposes the new
  optionalAttributes, (b) runtime signals are NOT in the registry
  metadata, (c) every registered validator accepts its own empty node
  + rejects mismatched-tag.
- Consolidate the 9 attribute-free new elements into one shared
  text-only-elements.test.ts (same trivial schema; per-element
  duplication wouldn't add coverage).
- Add dedicated bv-diagram.test.ts (type enum) and bv-fact.test.ts
  (category enum, subject/value pass-through).
- Update sample-topic.html fixture to exercise every new element +
  the new frontmatter attributes; assert no runtime signals leak in.
- Update curate-prompt.test.ts to assert the expanded vocabulary,
  category/type enums for bv-fact/bv-diagram, and that the prompt
  explicitly excludes runtime signals from bv-topic attributes.

Re-run the M1 T2 fluency check on the same 20 fixtures with the new
vocabulary: 20 / 20 valid (gate cleared); 100% cohort-appropriate
placement (bug+fix → bv-bug+bv-fix; rule+decision → bv-rule+bv-decision;
general → bv-reason); 122 bv-fact instances extracted across the run
(~6 per fixture). Fluency report updated:
research/features/html-memory-conversion/milestones/01-experiment/fluency-report.md

Open contract violation persists: 18/20 outputs (90%) wrap in code
fences despite explicit prompt instruction. Recommendation for T3
unchanged — strip in the response parser, don't iterate the prompt.
Second writer audit (markdown-writer.ts:270-354) found 3 rawConcept
sub-fields without an HTML home: rawConcept.timestamp, rawConcept.author,
rawConcept.patterns. Add the missing elements so the M1 vocabulary
covers every section the writer renders.

Vocabulary additions (16 → 19 elements):

- <bv-timestamp> — text-only; renders as `**Timestamp:**` under
  `## Raw Concept`. Distinct from frontmatter createdAt/updatedAt
  (system-set) — this is the date the concept's data represents.
- <bv-author> — text-only; renders as `**Author:**`.
- <bv-pattern> — structured (flags + description as attrs, pattern
  as text content). Multiple siblings inside <bv-topic> collected
  into the `**Patterns:**` bullet list. Mirrors how <bv-fact> works.

Sanitiser:

- stripCodeFenceWrapper(html) added to html-parser.ts. Strips a single
  outer ``` ... ``` (any language tag) wrapper from the LLM response.
  Inner fences (e.g., <pre><code> blocks inside <bv-diagram>) survive.
  T3's curate executor calls this before parseHtml. Defensive
  sanitisation generalises better than chasing the model's quirk via
  prompt iteration — observed fence-wrap rate is 70% on Sonnet 4.5
  even with explicit prompt instruction to the contrary.

Tests:

- bv-pattern dedicated test (bv-pattern.test.ts) for the structured
  attrs.
- text-only-elements.test.ts extended for bv-timestamp + bv-author
  (same trivial schema as the other text-only elements).
- html-parser.test.ts: 7 new cases for stripCodeFenceWrapper covering
  ```html, ```xml, no-language ```, leading/trailing whitespace,
  unwrapped passthrough, mismatched-fence passthrough, inner-fence
  preservation, and parse-after-strip end-to-end.
- sample-topic-roundtrip.test.ts: renderable-MD coverage assertion
  extended to require all 15 dedicated body-section elements (was 12).
- sample-topic.html fixture exercises the 3 new elements.

Re-ran the M1 T2 fluency check on the same 20 fixtures with the
19-element vocabulary. Results:
- 20 / 20 valid (gate cleared)
- 100% cohort-appropriate placement (unchanged)
- bv-timestamp 60% adoption (used where input has a clear date)
- bv-author 10% (used only on inputs that name an owner)
- bv-pattern 0% — model correctly did not over-emit; no fixture
  contained regex content. Restraint is good behaviour.
- Cost: $0.61 (vs. $0.60 for the 16-element draft; negligible)

Updated fluency-report.md in research repo with final coverage matrix
and the 19-element → markdown-writer mapping for T3.
…Query/CurateLogEntry

Add per-call token usage (input, output, cached, cache-creation), latency tiers
(searchMs, llmMs, totalMs), and format mode (markdown vs html) to QueryLogEntry
and CurateLogEntry. Token rollup uses a per-task TaskUsageAggregator subscribed
to llmservice:usage events on the agent bus; LoggingContentGenerator emits the
events on both streaming and non-streaming paths. AiSdkContentGenerator folds
result.usage and providerMetadata.anthropic.cacheCreationInputTokens into a
synthesized rawResponse so per-provider extraction lands consistently across
Anthropic, OpenAI, and Google.

Live-validated against 30 curates × 3 providers; all four token fields populate
end-to-end with cacheCreationTokens correctly Anthropic-only.
[ENG-2738] HTML-emission contract in curate.txt + fluency check
- core/interfaces no longer reaches into infra: CurateUsageRecord moved to
  core/domain/entities/curate-log-entry, IUsageAggregator added in
  core/interfaces/telemetry/, TaskUsageAggregator now `implements` it.
- Consolidate curate telemetry assembly inside CurateExecutor:
  runAgentBody invokes reportTelemetry on both happy and error paths;
  agent-process passes onTelemetry that forwards via task:curateResult.
  Removes the duplicated inline assembly in agent-process.
- QueryExecutor.buildTiming prefers TaskUsageAggregator.getLlmMs() over the
  executeOnSession wall-clock so llmMs counts only LLM-call time, not
  tool-execution overhead. Falls back to wall-clock for tests.
- CurateLogHandler error finalization merges telemetryFields(state.usage)
  so failed curates don't underreport cost when the executor's best-effort
  error-path reportTelemetry has populated state.usage.
- OOD short-circuit in QueryExecutor routes through formatDetector.detect([])
  instead of returning format: undefined, so future HTML-aware detectors
  don't get silently bypassed.
- DRY: extract makeUsageListener helper in agent-process and reuse for
  both curate and query handlers.
- Document that PROVIDER_TYPES iteration in LoggingContentGenerator only
  matches 'aiSdk' in production (all live traffic flows through AiSdkContentGenerator).
…ocation

Re-review flagged the gap that error-path reportTelemetry being silent could
mask a future regression where failed-curate telemetry disappears from disk.
Pin both directions of the contract:
- happy path: onTelemetry fires once before runAgentBody returns
- error path: onTelemetry fires once before the throw propagates so the
  daemon can still emit task:curateResult ahead of task:error
feat: telemetry producer for QueryLogEntry + CurateLogEntry
…or wiring

Wires the M1 HTML context-tree experiment into the curate flow. When the
new BrvConfig.useHtmlContextTree flag is enabled, curate emits a
<bv-topic> HTML document using T1's element registry (19 elements after
T2's expansion) and atomically writes it to .brv/context-tree/<path>.html.
When the flag is off (the default), the existing markdown curate path
runs unchanged.

Rolled-in T2 review fixes (PR #628 auto-review):
- curate.txt: split the "Standard HTML inside <bv-*>" clause per
  allowed-children semantics. Inline-only subset (code/strong/em) for
  bv-rule, bv-task, bv-flow, bv-fact, bv-pattern, bv-timestamp,
  bv-author. Full block subset for bv-topic and the block-content
  elements. Removes the M1-passthrough/M2-strict drift the reviewer
  flagged.
- curate.txt: prefix the examples block with a disclaimer that the
  fences are for prompt readability only and the agent's actual output
  must be bare HTML. Lower-cost mitigation for the model's stubborn
  fence-wrap habit (T2's stripCodeFenceWrapper still handles the
  remaining cases).
- curate-prompt.test.ts: enum-value tests now anchor to the section
  after each element header instead of whole-file string match.
  Catches "severity moved from bv-bug to bv-decision" drift.
- curate-prompt.test.ts: new test that parses every fenced example
  block from curate.txt with parseHtml and runs each <bv-*> element
  through its registered validator. Catches example typos
  (severity="hihg") AND vocabulary drift in one assertion.

T3 proper:

BrvConfig:
- New optional boolean field useHtmlContextTree (default undefined ==
  false). Plumbed through the JSON validator, withReviewDisabled,
  withVersion, and withSpace withers. withoutSpace already used
  spread, so it picks up the new field automatically. 8 new BrvConfig
  unit tests cover round-trip + wither preservation.

format-detector (src/server/infra/render/format/format-detector.ts):
- Two pure functions: getFormatForWrite(config) reads the flag,
  getFormatForRead(filePath) inspects the extension. Default markdown.
  12 unit tests cover both branches plus edge cases (case-insensitive
  extensions, directory-named-foo.html with .md inside, etc.).

html-writer (src/server/infra/render/writer/html-writer.ts):
- Public surface: validateHtmlTopic(html) (pure validation, no I/O)
  and writeHtmlTopic({contextTreeRoot, rawHtml}) (validation + atomic
  write). Pipeline:
    1. stripCodeFenceWrapper (T2-provided) on raw input.
    2. parseHtml + walkElements; require exactly one <bv-topic> and
       a non-empty path attribute.
    3. Run every typed bv-* element through its registered validator.
       Reject on any failure with structured errors.
    4. Atomically write via DirectoryManager.writeFileAtomic to
       <contextTreeRoot>/<path>.html.
- Path-traversal hardening: backslashes normalised, `..` segments
  rejected, leading slashes stripped, resolved path must stay under
  contextTreeRoot. 15 unit tests cover validity branches, fence
  variants, atomic-write success/failure, and traversal attempts.

curate executor wiring:
- CurateExecuteOptions gains useHtmlContextTree?: boolean.
- runAgentBody branches the runtime prompt's final-step instruction:
  HTML mode tells the agent its FINAL RESPONSE is the bv-topic
  document and explicitly forbids tools.curate calls (which would
  write a sibling .md file). MD mode keeps the existing "verify via
  result.applied[].filePath" instruction unchanged.
- After agent.executeOnSession returns, HTML mode routes through a
  new handleHtmlCurateResponse() that calls writeHtmlTopic and shapes
  a CurationStatus from the result (success/failed). MD mode
  continues to call parseCurationStatus.
- 6 new unit tests cover: valid topic written, fence-wrapped response
  stripped, missing bv-topic → failed, invalid attribute → failed,
  flag=false uses MD path, flag=undefined uses MD path.

agent-process plumbing:
- The daemon's curate dispatcher reads cachedBrvConfig.useHtmlContextTree
  and forwards it on every CurateExecuteOptions call. Default false
  remains production-safe; the bench's Configuration B sets it to true
  via .brv/config.json.

Tests: 7892 pass, 0 fail. Typecheck clean, lint clean, build clean.

Out of scope (per the M1 plan):
- Storage path migration to .brv/render/ — M2.
- IR (ComponentIR) compilation — M2.
- Multi-tier query / structural selectors — T4.
- to-injection/to-evidence/to-summary/to-md adapters — M2.
- Migration tool for existing .brv/context-tree/ — M2.
HTML is the curate format; there is no MD/HTML branching to control.
Removing the flag simplifies the executor (one prompt, one response
handler), the BrvConfig schema (no per-project toggle), and the
format-detector (no write-side decision — only read-side dispatch
remains for legacy MD files).

Code changes:

- BrvConfig: remove the optional `useHtmlContextTree` field, the JSON
  validator branch, all wither plumbing, and the dedicated test block.
- ICurateExecutor: remove `useHtmlContextTree` from CurateExecuteOptions.
- CurateExecutor: collapse the prompt branch — every curate runs the
  HTML-emission instruction. Always route the agent's response through
  `handleHtmlCurateResponse`. Remove the dead `parseCurationStatus`
  method (the JSON-status-block path is gone).
- agent-process: drop the `useHtmlContextTree` plumbing on the curate
  dispatch site.
- format-detector: drop `getFormatForWrite` (no decision to make);
  keep `getFormatForRead` so the query/search read path can still
  route legacy `.md` files transparently. Drop the BrvConfig import.

Test changes:

- BrvConfig: drop the 8-case `useHtmlContextTree` describe block.
- format-detector: drop the 4 `getFormatForWrite` cases.
- curate-executor-html-mode: drop the two MD-fallback cases (no flag
  to toggle); the file is now strictly HTML-emission tests.

Comment cleanup (separate concern, same commit):

Strip milestone scaffolding (M1, M2, T1–T7) from comments across the
render layer. Comments now describe logic and decisions, not the
project plan that scaffolded them. Specifically:

- "M1 light validation; ADR-007 §13 strict validation is M2" →
  "Light validation; strict validation per ADR-007 §13 is future work"
- "M1 element registry"/"M1 vocabulary" → "element registry"/"closed
  bv-* vocabulary"
- "T3 wires the writer", "T4 owns the index", etc. → describe the
  consumer by role, not by milestone task
- "M1 keeps the existing .brv/context-tree/ storage" → describe what
  the resolver encodes, not which milestone made the decision
- Test descriptions: "(parse-and-skip — M1 light validation)" →
  "(parse-and-skip — light validation)"

Pipeline:
- 7878 tests pass (lost 14 with the dropped flag — 8 BrvConfig + 4
  format-detector write + 2 executor MD-fallback).
- Typecheck clean, lint clean, build clean.
End-to-end smoke testing surfaced a gap: the executor's runtime prompt
told the agent to emit HTML as its final response, but the base system
prompt (system-prompt.yml) had ~500 lines of detailed guidance for the
old structured-args curate flow (UPSERT/ADD/UPDATE/MERGE/DELETE with
rawConcept/narrative/facts content shape). The system prompt's
specifics overrode the runtime instruction; the agent kept building
JS data structures via code_exec and never emitted HTML, so the writer
saw nothing to validate and no .html file landed.

Rewrite the curate-related sections of system-prompt.yml to match the
HTML-emission contract:

- Update <TOOL_QUICK_REFERENCE>, <WORKFLOW_SUMMARY>,
  <PATH_TITLE_CLARIFICATION>, <PROHIBITED_BEHAVIORS>, <DIRECTORIES>,
  and the <SANDBOX_PROGRAMMING> curate example to instruct HTML
  emission. The agent's final response IS the bv-topic document; do
  NOT call tools.curate; do NOT wrap in code fences.
- Replace the entire ## Curation Workflow section (~500 lines, lines
  275–786) with a tighter HTML-focused workflow (~217 lines):
  - Keep the recon → mapExtract → dedup → groupBySubject preprocessing
    helpers (still useful for chunked/large inputs).
  - Document the bv-topic frontmatter (path/title required; summary,
    tags, keywords, related optional) and explicitly call out that
    importance/maturity/recency/updatedat are sidecar runtime signals,
    not authored.
  - Map MD sections to bv-* elements (Reason → bv-reason, Raw Concept
    sub-fields → bv-task/bv-changes/bv-files/bv-flow/bv-timestamp/
    bv-author/bv-pattern, Narrative subsections → bv-structure/
    bv-dependencies/bv-highlights/bv-rule/bv-examples/bv-diagram,
    Facts → bv-fact, runbooks → bv-decision/bv-bug/bv-fix).
  - Document inline-content vs block-content elements and the standard
    HTML allowed inside each.
  - Keep facts extraction guidance, content quality requirements,
    temporal preservation, and content preservation rules — translated
    to the bv-* element vocabulary.
  - Drop the operation-type table (UPSERT/ADD/UPDATE/MERGE/DELETE),
    Two-Part Context Model section, domainContext/topicContext/
    subtopicContext (these belonged to the structured-args flow that
    no longer exists).
- Update the query-side temporal-reasoning hint to reference
  bv-timestamp content + sidecar runtime signals instead of
  rawConcept/narrative timestamps.
- Update the related-topics search hint to reference the `related`
  attribute on bv-topic alongside the legacy `relations` frontmatter
  on .md topics.

Net change: 196 insertions, 494 deletions. The rewritten prompt is
~217 lines shorter and aligned with the actual emission contract in
curate.txt.

Verified end-to-end via local-auto-test/curate-html-e2e/ harness:
- 3/3 fixtures (bug-fix, rule-decision, general-context) curate to
  registry-valid HTML topic files (Sonnet 4.5).
- File paths land at <projectDir>/.brv/context-tree/<path>.html as
  expected.
- Per-fixture latency: 19.3s, 30.5s, 27.3s.

Pipeline: 7878 unit tests pass, lint clean, typecheck clean, build
clean.
…e time

The previous markdown writer rendered `createdAt` and `updatedAt` in
the YAML frontmatter of every topic file. The HTML writer was dropping
both — the LLM doesn't author them and the writer wasn't injecting
them either, so HTML topics had no temporal metadata on disk.

Inject both as `<bv-topic>` attributes during the writer's atomic
write, after fence-stripping and registry validation:

- `updatedat` is always set to the current ISO-8601 instant.
- `createdat` is preserved from the existing file on disk if one
  exists at the resolved path; otherwise it is set to the current ISO
  instant. Re-writes leave `createdat` alone, advance `updatedat`.
- Any value the LLM authored for these attributes is overridden — the
  agent is not allowed to choose its own timestamps.

Implementation: surgical regex edit on the bv-topic opening tag (no
parse → re-serialize round-trip) so the LLM's formatting (whitespace,
attribute order, quoting style) survives intact. Two helpers added to
html-writer.ts:
- setBvTopicAttributes(html, attrs) — replace-or-insert each attr.
- readExistingTopicAttribute(filePath, name) — surgical read of an
  attribute value from an existing on-disk topic without parsing the
  whole document.

Tests:
- Existing assertions about file content equaling VALID_TOPIC updated
  to match-with-injected-timestamps (body preserved verbatim, bv-topic
  opening tag now has createdat/updatedat).
- Three new cases under "system-managed timestamps":
  1. First write injects both timestamps within the test window.
  2. Re-writes preserve createdat, advance updatedat.
  3. LLM-authored values for either attribute are overridden by the
     system value.

Verified end-to-end via local-auto-test/curate-html-e2e/ — all 3
fixtures produce HTML topic files with system-injected createdat and
updatedat ISO-8601 timestamps on the bv-topic root, alongside the
LLM's frontmatter (path/title/summary/tags/keywords/related).

Pipeline: 18 html-writer unit tests pass (3 new), full suite green,
typecheck clean, lint clean.
CI test failure (curate-executor-html-mode.test.ts:79) — the file-content
assertion still expected the LLM's raw HTML; now matches the
timestamp-injected file by checking the body is preserved verbatim and
that createdat/updatedat are present. Same fix in the fence-stripping
test variant.

Review feedback addressed:

1. Telemetry format hardcoded to 'markdown' (curate-executor.ts:485).
   The flag is gone; HTML is the only curate format. Updated to emit
   'html' and reworded the doc comment. Updated the regression test in
   curate-executor.test.ts:491 to assert 'html' accordingly.

2. validateHtmlTopic does not catch path-traversal (html-writer.ts).
   The doc comment promised "safe to use the topic" but a payload like
   `<bv-topic path="../../../etc/passwd">` returned `{ok: true}` —
   path-traversal was only caught later when topicPathToFilePath threw.
   That works end-to-end (executor catches and surfaces as failed
   status), but is a foot-gun for any future caller using
   validateHtmlTopic standalone (preview, dry-run).

   Move the path-segment safety check into validateHtmlTopic and
   surface as a new structured error kind 'unsafe-path'.
   topicPathToFilePath keeps path.resolve as defence-in-depth. Added
   two unit tests covering `..` segments and single-`.` segments;
   updated the existing executor-side test to expect a validation
   failure (with `unsafe-path` errors) instead of a thrown exception.

3. Attribute-validation errors swallowed the `field` discriminant
   (curate-executor.ts:344). The writer captures which attribute
   failed (`severity`, `id`, etc.) but the executor's
   verification.missing string only included the tag. So a malformed
   `severity="urgent"` was logged as `attribute-validation
   (bv-rule): …` rather than `attribute-validation (bv-rule.severity):
   …`. Updated the mapping to include the field for
   attribute-validation errors. Non-blocking but improves bench-log
   debuggability.

Reviewed but obsolete (not actioned):

- "format-detector.ts ships an unused getFormatForWrite" — removed in
  the flag-drop commit f701cd2.
- "agent-process.ts double-coerces useHtmlContextTree" — also obsolete
  (flag dropped).

Pipeline: 7883 unit tests pass, lint clean, typecheck clean, build
clean. E2e smoke (curate-html-e2e/) green on the rebuilt CLI.
[ENG-2739] curate path — HTML writer + format detector + executor wiring
…ch routing

Wires the search/query path to read HTML topic files transparently
alongside markdown ones. Curate emits HTML; the legacy MD path stays
in for `brv swarm` and existing trees that haven't migrated. The BM25
index is format-agnostic (HTML inner-text decodes to the same plain
text the markdown tokenizer expects); the indexer dispatches on file
extension at read time.

New modules:

- src/server/infra/render/reader/html-reader.ts — readHtmlTopicSync(html)
  returns {bodyText, elements, topicAttributes}. bodyText is the
  entity-decoded inner text, ready to feed BM25. elements is a flat,
  document-order list of every typed `<bv-*>` node with its attribute
  map. topicAttributes lifts the bv-topic frontmatter (path, title,
  summary, tags, keywords, related). Forgiving on malformed input
  (parse5 returns a best-effort tree).

- src/server/infra/render/reader/element-axis-index.ts — in-memory
  ElementAxisIndex. Two query keys:
    findByTag(tag) → paths
    findByAttribute(tag, attr, value) → paths
  Reverse path→keys map keeps invalidation O(keys-touched) instead of
  full-scan. Built in lockstep with the BM25 index; same lifetime,
  same invalidation triggers. Today's consumer is the optional
  `elementHint` filter on SearchOptions; the structural-selector
  grammar will hook in here without further reshaping.

Search service changes (search-knowledge-service.ts):

- File discovery: glob extended from `**/*.md` to `**/*.{html,md}`.
- Indexing: new readIndexableContent() helper dispatches on
  getFormatForRead(filePath). HTML files route through readHtmlTopicSync
  (the BM25 input is bodyText, not raw markup) and contribute their
  element list to the structural-axis index. MD files index their raw
  content as before.
- IndexedDocument gains a `format: 'html' | 'markdown'` field, set
  from the file extension at index time and surfaced on every
  SearchKnowledgeResult entry. Telemetry/query-log consumers
  (T5-shipped QueryLogEntry.format) populate from this without
  format-detection at the call site.
- CachedIndex gains `elementAxisIndex: ElementAxisIndex`, populated
  during buildFreshIndex from the per-document elements walk. The
  empty-tree fallback also constructs an empty index so callers can
  treat the field as non-nullable.
- SearchOptions gains `elementHint?: ElementHint`. When present, the
  search composes a MiniSearch filter that intersects scope +
  element-axis matches before BM25 ranking. No caller passes the hint
  today; M2 selector grammar is the consumer.
- INDEX_SCHEMA_VERSION bumped to 6 so previously-cached indexes built
  against MD-only schema rebuild on next query.
- Title resolution: HTML files use the bv-topic title attribute; MD
  files retain the existing extractTitle frontmatter walk.
- The structural-selector hint is also exposed on the public
  SearchKnowledgeOptions as `SearchKnowledgeElementHint` — wired
  through ToolsSDK so SDK consumers can pass it without going through
  the internal SearchOptions surface.

Tests:

- html-reader (9 cases): bodyText extraction, entity decoding,
  attribute lift, elements-list document order + de-duplication,
  unknown-bv-* skipping, missing bv-topic graceful behaviour,
  malformed input non-throw.
- element-axis-index (10 cases): findByTag/findByAttribute, empty-array
  for unknown tags, case-sensitivity on values, single-file multiple-
  contribution dedup, remove() drops all keys + cleans empty key sets,
  remove() on unknown path no-op, clear().
- search-knowledge-service-html-routing (8 cases): HTML files indexed,
  format field populated for both HTML and MD, raw markup is NOT
  searchable (inner-text only), bv-topic title attribute lifts as
  doc title, elementHint pre-filter (no-match returns empty,
  tag-only restricts, tag.attr=value restricts further).

Pipeline: 7910 unit tests pass (27 new), typecheck clean, lint clean
(0 errors), build clean.

Out of scope (per the M1 plan):
- Structural selector grammar — M2 (this PR ships the substrate).
- findReferences reverse-graph — M2.
- Signal-precise cache invalidation — mtime-based today is sufficient.
- BM25 ranking parity test (MD vs HTML same content) — M2 bench
  measures end-to-end; the routing tests above cover the
  determinism boundary.
- HTML BM25 input now concatenates `<bv-topic>` summary/tags/keywords/related
  into the indexable content. Without this, a term living only in those
  attributes ranked HTML topics far below the markdown equivalent (where
  YAML frontmatter is fed verbatim into BM25).
- HTML title fallback uses `?.trim() || fallbackTitle` so a `title=""` or
  whitespace-only attribute doesn't leak into BM25's 3x-boosted title field.
- `ElementHint` (service + tools-sdk) becomes a discriminated union so
  `attribute` and `value` must be supplied together at compile time.
- `ElementAxisIndex` storage swaps the stringly-keyed
  `${tag}.${attr}=${value}` table for nested Maps — eliminates the
  `.`/`=`-in-attribute collision class without a delimiter discipline.
- `html-reader` lifts attributes off the FIRST `<bv-topic>` encountered
  unconditionally rather than the first non-empty one (matches the
  contract on malformed input).
- Glob discovery does two parallel passes (`**/*.html`, `**/*.md`)
  instead of relying on the underlying engine to expand `{html,md}`.

Tests: +7 (BM25 attribute-payload parity, title-fallback, FS-backed
reader round-trip, topic-attribute lift contract, attribute-name/value
collision robustness x3). All 7917 pass; lint + typecheck clean.
[ENG-2740] query path — HTML reader + element-axis index + search routing
…sponse

Tier 2 (`brv query`'s direct-search response — no LLM) reads the full
file at `query-executor.ts:701` and ships the bytes back to the user
through `formatDirectResponse`. For HTML topics that's raw
`<bv-topic ...><bv-rule severity="must">...</bv-rule>...` markup.

Two consequences observed during the M1 multi-tier audit:

  1. The 5000-char content budget in `direct-search-responder.ts:11`
     burns on tags and attribute syntax before reaching meaningful
     text. Long topics get truncated mid-element.
  2. Any LLM-shaped consumer downstream (Tier 4 agent re-reads, the
     bench harness, third-party integrations) re-parses the HTML.
     Wasted tokens; lossy semantics.

Stripping every tag (bodyText only) loses the typed-element signals
the format exists to carry — severity on rules, subject/value on
facts, id on decisions. New `renderHtmlTopicForLlm(html)` walks the
parsed tree and renders each `<bv-*>` element with a semantic prefix:

  - `<bv-rule severity="must" id="r-1">x</bv-rule>` →
    `- **Rule** [must] (r-1): x`
  - `<bv-fact subject="alg" value="RS256">…</bv-fact>` →
    `- **Fact** (subject=alg, value=RS256): …`

Wired into `tryDirectSearchResponse`: when `result.format === 'html'`
the full content goes through the renderer; markdown topics pass
through verbatim. Renderer is forgiving on malformed input (mirrors
the rest of the parse5-driven reader pipeline); on any throw we fall
back to raw bytes so a single bad topic can't blank the response.

Tests: +15 (12 renderer, 2 Tier 2 HTML branch, 1 Tier 2 markdown
regression guard). 7931 pass; lint + typecheck + build clean.
fix: Tier 2 direct response renders HTML topics into structured markdown
Post-HTML-migration, every QueryLogEntry on disk reports
`format: "markdown"` regardless of whether the recall touched HTML or
markdown topics. Telemetry consumers (query-log summaries, analytics
dashboards, the brv-bench harness) cannot distinguish HTML vs markdown
traffic — every chart flat-lines as "all markdown" even on a fully
migrated tree.

Root cause: T5 (ENG-2741) introduced `MarkdownOnlyFormatDetector` as a
deliberate stub with the explicit comment "swap when the format-detector
task lands the real extension-aware detector." T3 (ENG-2739) shipped
`format-detector.ts` for read-routing inside `search-knowledge-service`
but did not swap the executor's binding. The stub stayed wired as the
default in `QueryExecutor`.

This change:

- Adds `ExtensionAwareFormatDetector` — inspects each
  `matchedDoc.path` extension and returns `'html'` if any HTML doc was
  retrieved, `'markdown'` for legacy-only recalls, `undefined` for
  empty recalls (cache hit, OOD short-circuit, tier 4 LLM-only).
- Single-`'html'` policy is deliberate: any HTML doc is the
  load-bearing signal for telemetry rollups in a partially-migrated
  tree. Reporting `'markdown'` for a mixed result would hide HTML
  traffic from cost / coverage dashboards.
- Strips shared-source `[alias]:` prefixes before extension parsing so
  shared-source paths classify the same as local.
- Defaults to `'markdown'` for extensionless paths (defensive).
- Wires as the default `formatDetector` binding in `QueryExecutor`.
- Keeps `MarkdownOnlyFormatDetector` for tests that pin pre-migration
  semantics; updates its docstring + comment to reflect the new role.

Tests: +9 (extension-aware detector full coverage). 7940 pass; lint
runs through the pre-existing submodule config issue, otherwise clean.
ByteRover-provider runs produce QueryLogEntry records with no
`inputTokens` / `outputTokens` / `cachedInputTokens` /
`cacheCreationTokens` even though Tier 3 calls take seconds of llmMs.
Telemetry consumers cannot compute per-query cost.

Root cause: T5's telemetry pipeline depends on
`GenerateContentChunk.rawResponse` being set on the terminating
chunk so `LoggingContentGenerator.generateContentStream()` can
capture the last non-undefined `rawResponse` and feed it to
`pickRawUsage()` (`.usage ?? .usageMetadata`). The AI-SDK content
generator wires this on its `'finish-step'` case but
`ByteRoverLlmHttpService.extractContentFromResponse()` (the simulated
stream backing every `ByteRoverContentGenerator` call) yielded chunks
without ever attaching the backend response. Result: `lastRawResponse`
stays undefined, no `llmservice:usage` event fires, no tokens land on
the log entry.

Fix: attach the full `GenerateContentResponse` as `rawResponse` on the
terminating chunk in both branches of `extractContentFromResponse` —
empty-content and full-content paths. The `pickRawUsage` helper
already handles the Gemini-shaped (`usageMetadata`) and Claude-shaped
(`usage`) payloads the ByteRover backend returns through this code
path, and the existing `extractUsage('google'|'anthropic')` mappers
take it from there.

Tests: +3 (terminating-chunk rawResponse coverage across content,
empty-parts, and function-call termination cases). 7943 pass.
The companion prompt loaded into every curate-mode invocation
(`curate-detail-preservation.yml`, auto-discovered by
`companion-contributor.ts:discoverCompanionPrompts('curate')` against
the `curate-*.yml` glob) still instructed the agent to "store in
narrative.rules / narrative.highlights / narrative.diagrams /
narrative.structure / narrative.dependencies / narrative.examples /
rawConcept.flow / content.facts / snippets" — every one of which is a
JSON-schema field on the pre-T3 curate-tool API that no longer exists.
T3 (ENG-2739) dropped `useHtmlContextTree` and made HTML the sole
emission format, but T2 (ENG-2738, the HTML-emission contract) only
updated `curate.txt` and missed this companion file.

Empirically observable on the proj/html-mem-conversion branch: a
LoCoMo-bench curate against `conv_26/session_1` produced a `<bv-topic>`
with one giant `<bv-rule>` element wrapping a markdown narrative —
the agent saw "store in narrative.rules" and treated `<bv-rule>` as
the closest match. Other sessions in the same run emitted clean
typed-element shapes; the variance correlates with whether the
legacy companion prompt dominated that turn's context.

This change rewrites the file so every preservation rule maps to the
correct `<bv-*>` element from the closed vocabulary
(`bv-rule`, `bv-task`, `bv-fact`, `bv-decision`, `bv-bug`, `bv-fix`,
`bv-flow`, `bv-diagram`, `bv-highlights`, `bv-structure`,
`bv-dependencies`, `bv-examples`, `bv-pattern`). General preservation
principles (completeness over conciseness, exact preservation, never
summarize) are kept verbatim. Adds an explicit guard at the end of
the prompt telling the agent NOT to use `narrative.*` / `rawConcept.*` /
`content.*` JSON-schema field names so any residual model bias toward
the pre-migration API surfaces is suppressed.

Tests: no test fixture asserts on this file's contents (it's a runtime
agent prompt, exercised via end-to-end curate runs). YAML still parses;
full suite green at 7943.
Two items from the review pass on this branch:

  1. `extractContentFromResponse` had three terminating-chunk paths;
     only two received `rawResponse` in the original fix. The third
     branch — `candidates` array missing or empty (line 193) — fires
     when Gemini blocks output via safety filter or Claude returns a
     refusal-shaped response. Both providers populate `usageMetadata`
     / `usage` on those responses, so dropping `rawResponse` here
     silently loses billable tokens on every safety-block /
     refusal turn. Trivial completion: forward `response` from this
     branch too, matching the other two. New test pins the contract.

  2. JSDoc rewrite on `stripSharedAlias` to justify why the helper is
     retained. Today `path.extname` happens to handle
     `[alias]:rel/path.html` correctly (the colon lives in the dirname,
     not the basename), so the strip is defensive rather than
     load-bearing. The helper stays for two reasons noted in the
     docstring: keeping the detector independent of
     `getFormatForRead`'s internals (so a future switch to URL-style
     parsing doesn't silently break the alias case), and making the
     alias-stripping branch testable in isolation. Prevents a future
     reader from deleting it as apparent dead code.

Tests: +1 (candidates-empty terminating-chunk rawResponse contract).
7944 pass.
fix: extension-aware format detector + byterover tokens + curate prompt
…aceholder

First milestone of byterover-tool-mode: ByteRover stops being an agent
for curate/query and becomes a tool the user's primary agent calls. The
user's LLM (Claude Code, Cursor, etc.) supplies completions; byterover
validates and persists. No provider config inside byterover.

This commit ships the wire-protocol surface for the curate side. The
orchestrator behind it is a placeholder — TKT 02 lands the real state
machine; TKT 03 fills in real prompts; TKT 04 authors the SKILL.md
that drives the loop from Claude Code.

Wire contract (documented at docs/curate-protocol.md):

  Kickoff (env-gated):
    BRV_CURATE_TOOL_MODE=1 brv curate "<text>" --format json
    → {status: needs-llm-step, sessionId, step: generate-html, prompt}

  Continuation (--session implies tool mode):
    brv curate --session <id> --response "<text>" --format json
    → {status: done, filePath} | {status: failed, errors[]}

Without the env var, brv curate runs today's legacy agent-driven path
unchanged — fully additive. The env var is the temporary opt-in; TKT 02
replaces it with a BrvConfig field as part of the real orchestrator.

Placeholder semantics:
  - Kickoff always returns one `needs-llm-step` with a stub prompt
  - Continuation always returns `done` on first valid response
  - State on disk at `.brv/sessions/curate-<id>/state.json` (TKT 02
    moves into daemon task-session sandbox vars when search-service
    integration lands)
  - Empty response → `failed` with kind=empty-response (session kept
    live for retry); unknown sessionId → `failed` with
    kind=unknown-session (terminal)

Tests: 12 placeholder unit tests covering kickoff, continuation
success/failure paths, on-disk state lifecycle, and envelope-contract
guarantees per the protocol doc. Manual round-trip smoke verified the
CLI happy path end-to-end.
DatPham-6996 and others added 2 commits May 27, 2026 12:21
- Add dream.md to managed ByteRover skill files
- Add regression coverage for skill template enumeration
- Fail YAML MCP writes when existing config is malformed or non-mapping
- Preserve existing YAML bytes instead of overwriting with a fresh config
feat: [ENG-2697] rewrite ByteRover skill — sub-skill split + Hermes connector
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Claude finished @cuongdo-byterover's task in 8m 27s —— View job


Re-review of PR #663 after the ENG-2697 batch (PR #675)

Three new merges since be740ea (the prior review's baseline): PR #708 (VC-sync hint — reviewed in the May 26 cycle), PR #675 (ENG-2697 — skill restructure + Hermes/OpenClaw + onboarding tour + autonomous attachments, ~9 commits, +3275/-783 LOC), and b49d3416 (a follow-up fix preserving malformed YAML configs during MCP install).

The ENG-2697 batch is large but well-scoped: monolithic SKILL.md split into 10 sub-skill files (SKILL.md + onboarding + query/curate/review/swarm/vc/dream/history/troubleshooting); two new agents land first-class (Hermes via skill + YAML MCP, OpenClaw via skill + AGENTS.md attachment); a shared AgentPathResolver honors HERMES_HOME/OPENCLAW_* env overrides with clean dependency injection; an autonomous-attachment patcher upserts a managed BYTEROVER RULES block into Hermes SOUL.md and per-workspace OpenClaw AGENTS.md files with a byte-comparison freshness check that lets status() distinguish present-but-stale from clean install.

Score: 4.25 / 5 — net regression from 4.75/5. The infrastructure work is high-quality; the connector-manager-attachment freshness pattern is excellent contract design. But the skill template content has not kept pace with ENG-2880/ENG-2884 removals: curate.md, history.md, and onboarding.md actively instruct agents to use -f, --detach, and verify-by-logId patterns that the runtime now rejects. Because the SkillConnector writes these files to every supported agent on brv connectors install, on day 1 of any new install the agent will follow broken advice — the highest-impact regression in the PR.

What changed since be740ea

Commit(s) Scope
f5e92f5b ENG-2697 — split SKILL.md → 8 sub-skills; add Hermes (skill + YAML MCP) + OpenClaw (skill); shared agent-path-resolver + skill-path-resolver; autonomous-agent-attachments upserts managed block into Hermes SOUL.md / OpenClaw AGENTS.md; extract...FromBrvSection footer pattern preserves multi-agent AGENTS.md sharing
eafd498d ENG-2697 — add curate-judgement.md post-write self-review skill + 4-dimension Quality Bar in curate.md
f2bbbf36 ENG-2697 — backfill missing brv commands in vc/swarm/review/history/troubleshooting sub-skills
cb2cd042 ENG-2536 — 90-second guided onboarding.md skill, registered in SKILL_FILE_NAMES, gated by First-Turn Routing in SKILL.md
2baf2724 Onboarding tour Msg 3 surfaces vc/sync/cross-agent controls
6733e84e dream.md brought back in line with the post-ENG-2884 tool-mode dream subsystem
b49d3416 YAML MCP writer fails on malformed/non-mapping YAML (was silently overwriting user's broken config); adds dream.md to SKILL_FILE_NAMES; new test enumerates every template .md against the install set

Strengths

  • Block-freshness contract (rule-segment-patcher.ts:244-258). hasByteroverBlock(path, expectedBlock?) returns true only when an on-disk block matches the expected bytes — letting status() report "present but stale" as "not installed" so a same-type re-install can repair it. The companion test at connector-manager-attachment.test.ts:51-72 locks the invariant. Most teams stop at marker presence; this lifts the check one level further to "marker present AND content current," which is exactly what's needed for upgrades and hand-edits.
  • b49d3416 malformed-YAML preservation. Previously, a user with a hand-rolled but invalid Hermes config.yaml would have it silently overwritten with a fresh single-server config. Now write() throws with a structured "Cannot update YAML MCP config at " message, and the bytes are preserved. The two instanceOf(Error) assertions in yaml-mcp-config-writer.test.ts:176-198 exercise both malformed and non-mapping cases and assert the original content is unchanged afterward.
  • OpenClaw workspace-discovery faithfulness (autonomous-agent-attachments.ts:95-149). AGENTS.md lives in OpenClaw's workspace dir per loadWorkspaceBootstrapFiles, not the agent dir or state dir. Most teams would have shipped a silently-not-loaded AGENTS.md; this resolver mirrors OpenClaw's own resolveAgentWorkspaceDir / agent-id sanitization rules so the managed block lands where OpenClaw actually reads it. Subagent allowAgents discovery + dedup via Set rounds it out.
  • Injectable env/homeDir on the SkillConnector. pathResolverOptions() threads env / homeDir through every path resolver, letting tests inject without touching process.env. The 4-case Hermes test triad at skill-connector.test.ts:386-486 covers install / clean install / stale block / repair-on-re-install entirely via injection.
  • Multi-agent footer preservation. extractInstalledAgentFromBrvSection in shared/constants.ts:49-62 and the multi-agent AGENTS.md sharing pattern (Amp / Codex / OpenCode all map to AGENTS.md) is preserved through the autonomous-attachment changes — the agent-name footer inside the BRV markers still disambiguates ownership.

Issues

1. CRITICAL — skill templates instruct agents to use removed flags/concepts

Posted inline at curate.md:28-46 and history.md:8. Three classes of staleness, all of which the SkillConnector ships verbatim to every supported agent's skill directory:

  • -f / --files (removed in ENG-2880, see CURATE_REMOVED_FLAGS in removed-flags.ts:73-91):
    • curate.md:28 example: brv curate "..." -f src/retry.ts
    • onboarding.md:239 Path A: brv curate -f <path> (with the giveaway parenthetical "or whatever the project-file curate flag is")
  • --detach (also removed in ENG-2880):
    • curate.md:35-46 — entire "Execution Mode" section telling agents when to use --detach
    • curate.md:152 — Common Mistakes row about "detached curate work"
    • history.md:8, 12, 46, 48, 58 — five references to "detached curate" / verifying after the fact
  • logId returned by brv curate (gone in ENG-2884; tool-mode curate returns sessionId, not logId):
    • curate.md:42-45: "If a detached curate returns a log id, do not claim it is saved until this verifies completion: brv curate view <logId> --format json"
    • history.md:46: "Use brv curate view <logId> --format json when verifying a detached curate"

Impact: on a fresh brv connectors install <agent>, the agent reads these instructions and tries to run brv curate "..." -f path or brv curate "..." --detach — both produce a migration error from removed-flags.ts. The onboarding tour itself (Path A at onboarding.md:239) hits this on its second-to-last message, breaking the first-impression user experience for the agent that just installed the connector.

This is the highest-impact finding of this batch. Two clean paths:

  • Recommended: scrub all three classes from curate.md, history.md, and onboarding.md. Replace -f examples with inline-content curate strings, drop the "Execution Mode" section entirely (tool-mode curate is sync; the detach distinction is gone), and replace logId with sessionId where the post-curate context is the session protocol.
  • Alternative: add a unit test that scans every template .md for substrings from CURATE_REMOVED_FLAGS.flags (['--detach', '--files', '-f', '--folder', '-d', '--timeout']) and fails if any appear. Locks the contract durably so future flag removals don't drift through the skill files.
2. issue — Hermes workdir note is hardcoded path, ignores HERMES_HOME

Posted inline at curate.md:12 (also at query.md:10). The line tells the agent: "For hermes agent, run brv commands at workdir="~/.hermes/byterover/"" — but the actual install path is <HERMES_HOME>/skills/byterover/ when HERMES_HOME is set. Two more nits: double-space typo, and asymmetric coverage (OpenClaw has the same custom-root pattern but no equivalent note). If a Hermes-specific workdir convention is required, it belongs in byterover-rules-block.md (which gets injected into SOUL.md), not duplicated inline in 2 of 10 sub-skill files.

3. suggestion — McpConnector lacks the injection seam SkillConnector has

Posted inline at mcp-connector-config.ts:205. configPathResolver: () => path.join(resolveHermesHome(), 'config.yaml') calls the resolver with no options, so tests have to mutate process.env.HERMES_HOME directly (mcp-connector.test.ts:484-494). The SkillConnector takes injection options on its ctor — the asymmetry forces global-state mutation in tests, which can race under concurrent runs. Lift configPathResolver to accept AgentPathResolverOptions and thread through McpConnector's ctor for parity.

4. observation — YamlMcpConfigWriter malformed-YAML handling is asymmetric

Posted inline at yaml-mcp-config-writer.ts:65-86. b49d3416 correctly fixed write() to throw + preserve on malformed YAML, but exists() and remove() still silently treat malformed YAML as "no install." brv connectors status reports "not installed" when the user actually has a broken config — masking the real failure mode. The write() fix was the load-bearing one; the symmetry fix is non-blocking but worth a follow-up.

5. observations (carry-overs from prior reviews, unchanged) — non-blocking
  • Error messages still reference removed brv providers connect/switch commands (daemon-client.ts:36, 38, 41, 181-182, 274-277, task-validation.ts:26, 36, tui/utils/error-messages.ts:17-21). Reachability is near-zero post-ENG-2925 (all five user-facing tool-mode types exempt from validateProviderForTask), but OAuth refresh failure paths on brv push/pull/cloud-sync commands are still live and still emit the wrong advice. Suggest brv login as the replacement.
  • <existing-topic> block in curate-prompt-builder.ts:119-123 still inlines system-managed createdat/updatedat verbatim. Now exercised more often by tool-mode dream MERGE candidates calling the UPDATE path under meta.previousSummary.
  • topic-loader.ts:43, 110-155 regex parser still inconsistent with index-generator.ts:208-222 parse5 — entity-encoded title attributes diverge.
  • CLI ConsoleLogger chalk-on-stderr under --format json at curate-session.ts:380, 423.

Praise (posted inline)

  • rule-segment-patcher.ts:238-258hasByteroverBlock(path, expectedBlock?) byte-comparison freshness check is the right shape for upgrade-aware idempotency. Pairs with the connector-manager-attachment.test.ts:51-72 "same-type re-install repairs stale block" test to lock the contract durably.
  • autonomous-agent-attachments.ts:95-149 — OpenClaw workspace-discovery faithfully mirrors OpenClaw's own loadWorkspaceBootstrapFiles + agent-id sanitization, so blocks land where OpenClaw actually reads them. The allowAgents: ['*'] filter and Set-based dedup are the kind of upstream-config-language edge case most ports miss.

Test plan suggestions

  • A unit test that scans every file in src/server/templates/skill/*.md for substrings from CURATE_REMOVED_FLAGS.flags and QUERY_REMOVED_FLAGS.flags and asserts the count is zero. Would catch the curate.md -f / --detach drift automatically and prevent the same regression on the next flag removal.
  • An end-to-end test that runs brv connectors install <agent> against a tmpdir, reads the installed curate.md, then asserts the file doesn't contain '-f ' or '--detach' or '<logId>'. Adds a layer of coverage from the install path, not just the template source.
  • A test for Hermes status under a non-default HERMES_HOME that asserts the configPath in the status response matches the resolved path (locks issue 3 if you add the injection seam).

Final disposition

LGTM with the skill-template fix as a required follow-up. The ENG-2697 infrastructure work (autonomous attachments, path resolvers, YAML preservation) is high-quality and well-tested. But the stale -f / --detach / logId references in curate.md / history.md / onboarding.md actively break the first-impression UX for any agent that installs the skill connector after this PR lands — and the fix is just text edits in 3 markdown files. Worth resolving before merge or as a tight follow-up. Carry-over items are all known and have consistently been polished post-merge.

Todo

Comment thread src/server/templates/skill/curate.md Outdated
Comment on lines +28 to +46
brv curate "Retry helper treats HTTP 429 as retryable with exponential backoff." -f src/retry.ts
brv curate view --detail
brv review pending --format json
```

## Execution Mode

Default behavior is blocking. Run `brv curate "..."` with no flag and wait until it completes before using, querying, or reporting the newly curated data.

Use `--detach` only when both conditions are true:

1. No remaining step in this turn depends on the curated data.
2. The user explicitly told you not to wait, for example "don't wait", "don't block", "fire and forget", or "move on without waiting".

If a detached curate returns a log id, do not claim it is saved until this verifies completion:

```bash
brv curate view <logId> --format json
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (critical — skill template instructs agents to use removed flags): This file now ships to every supported agent (Claude Code, Cursor, Codex, Hermes, OpenClaw, etc.) via the SkillConnector, but three sections actively tell the agent to use flags + concepts that no longer exist:

  1. Line 28 — example uses -f src/retry.ts. --files / -f was removed in ENG-2880 (CURATE_REMOVED_FLAGS in src/oclif/lib/removed-flags.ts:73-91). An agent that copy-pastes this example gets the migration error: "--files / -f: This flag has been removed. Tool-mode curate pre-packs file content into the bv-topic HTML directly."

  2. Lines 35-46 — entire "Execution Mode" section about --detach. Also removed in ENG-2880 (removed-flags.ts:85-88). Any agent following the documented two-condition rule will hit the migration error on the first --detach invocation. The "If a detached curate returns a log id" path is unreachable.

  3. Line 45 / line 152 — brv curate view <logId>. Tool-mode brv curate does NOT return a logId anymore (it returns a sessionId for the session protocol). The <logId> concept came from the legacy LLM-driven curate-executor which was deleted in ENG-2884. brv curate view itself still exists, but agents trying to use the sessionId returned by tool-mode curate as a logId will get "not found".

This is the highest-impact finding of this batch — the skill connector ships these files to every supported coding agent's skill directory on brv connectors install, so on day 1 of any new install the agent follows broken advice.

Suggested fix:

  • Delete the entire "Execution Mode" section (lines 33-46).
  • Replace line 28's example with one that doesn't use -f (just inline the file content into the curate string).
  • Drop the "detached curate" row from the Common Mistakes table (line 152).
  • Verify history.md doesn't have the same issue (it does — history.md:8, 12, 46, 48, 58 all reference detached curate).

Fix this →


No ByteRover LLM provider is required. The calling agent's own LLM authors the HTML response.

***HIGHLY IMPORTANT***: For hermes agent, run brv commands at `workdir="~/.hermes/byterover/"`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (Hermes-specific workdir note is wrong when HERMES_HOME is set): This line (also at query.md:10) tells the agent the working dir is ~/.hermes/byterover/. Two problems:

  1. Path is wrong under HERMES_HOME. The SkillConnector installs Hermes skills at <HERMES_HOME>/skills/byterover/ (via resolveHermesHome()'s env override at agent-path-resolver.ts:97-104) — so a user with HERMES_HOME=/custom/path set will find the skills at /custom/path/skills/byterover/, not ~/.hermes/byterover/. The note is hardcoded.

  2. Asymmetric coverage. Only Hermes gets a workdir note; OpenClaw has the same custom-root pattern (globalRoot: 'openclaw-state') and the same autonomous-attachment mechanism, but no equivalent instruction.

  3. Typo (minor): double space in commands at \workdir``.

If a Hermes workdir note is required for the agent's working-directory plumbing, it belongs in the always-loaded byterover-rules-block.md (which gets injected into Hermes SOUL.md and OpenClaw AGENTS.md per workspace) — that's the right surface for per-agent runtime conventions. Duplicating it inline in curate.md + query.md (and not in the other 7 sub-skill files) is fragile.

Suggested fix:

  • Either move the instruction into byterover-rules-block.md with HERMES_HOME-aware path resolution, or
  • Drop the inline note entirely and let the agent infer the workdir from its own filesystem context.
Suggested change
***HIGHLY IMPORTANT***: For hermes agent, run brv commands at `workdir="~/.hermes/byterover/"`.

Fix this →


> "One more thing — your persona is saved, but I don't know your codebase yet. I see `CLAUDE.md` in your repo. Want me to curate it as starter project context? Takes about 30 seconds."

If the user says yes, run `brv curate -f <path>` (or whatever the project-file curate flag is) and confirm with one line.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (the onboarding tour will fail at Path A): Two problems with this line:

  1. brv curate -f <path> won't work. --files / -f was removed in ENG-2880 (see CURATE_REMOVED_FLAGS in removed-flags.ts:73-91). The first-impression curate call the tour makes on a fresh .brv/ will hit the migration error and the tour collapses at the most user-visible step.

  2. "or whatever the project-file curate flag is" is a tell — the author wasn't sure what the current flag is at write time. That uncertainty made it into the shipping skill file, where the agent reads it.

Path A is the documented happy path for a fresh user installing on a project with an existing CLAUDE.md / AGENTS.md / README.md. If it fails on the first try, the user's first impression of ByteRover is "the tour doesn't work."

Two clean fixes:

  • Recommended: Read the file content yourself and pass it inline as the curate context: brv curate "$(cat <path>)" (or the tool-mode equivalent — the calling agent owns this). Aligns with the post-ENG-2837 model where the agent pre-packs file content into the bv-topic HTML.
  • Alternative: Replace Path A entirely with Path B (the prompt-for-one-rule path) — it has no flag dependency.
Suggested change
If the user says yes, run `brv curate -f <path>` (or whatever the project-file curate flag is) and confirm with one line.
If the user says yes, read the file yourself and curate its content: `brv curate "<one-line description>: $(cat <path>)"` (or the tool-mode session form). Confirm with one line.

Fix this →

Comment thread src/server/infra/connectors/mcp/mcp-connector-config.ts
Comment thread src/server/infra/connectors/mcp/yaml-mcp-config-writer.ts
Comment thread src/server/infra/connectors/shared/rule-segment-patcher.ts
Comment thread src/server/infra/connectors/skill/autonomous-agent-attachments.ts
Comment thread src/server/templates/skill/history.md Outdated

# ByteRover History

Use history commands to audit what ByteRover queried, curated, or queued. History is especially important before trusting detached curate work from a previous step.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (cascades from the curate.md "detached curate" finding): This file has 5 references to "detached curate" / --detach:

  • Line 8 (here): "History is especially important before trusting detached curate work from a previous step."
  • Line 12: "You need to verify a detached brv curate completed."
  • Line 46: "Use brv curate view <logId> --format json when verifying a detached curate before relying on it."
  • Line 48: "If a detached curate is still processing, wait or tell the user it is not complete."
  • Line 58 (Common Mistakes row): "Trusting a detached curate without checking status."

All five are dead concepts post-ENG-2880 (no --detach) and ENG-2884 (no logId). The file remains useful for brv curate view introspection and brv query-log inspection, but the "verify detached curate" framing is stale.

Suggested rewrite for the opening paragraph:

Use history commands to audit what ByteRover queried, curated, or queued. Useful for the user asking "what was decided" or for confirming a recent curate's final shape.

And drop lines 12, 46, 48, 58 entirely (or recast as "verify a completed curate" — but if completion is synchronous in tool mode, there's nothing to verify after the fact).

Fix this →

- Remove redundant execution-mode guidance from the curate skill template
- Reframe history guidance around confirming recent curate output
- Update onboarding to curate starter docs from file contents directly
docs(skill): align curate guidance across templates
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Claude finished @DatPham-6996's task in 5m 36s —— View job


Re-review of PR #663 after dd56e2cc ("docs(skill): align curate guidance across templates")

One PR-specific commit since the prior review baseline (11adb44a): dd56e2cc, merged via PR #717. It addresses the most user-impactful chunk of the critical finding from the May 27 review — the onboarding tour's Path A (brv curate -f <path>) would have failed immediately on a fresh install. The fix is exactly the post-ENG-2837 pattern, almost word-for-word matching the suggested edit. The "Execution Mode" section (the largest stale block in curate.md) is also gone, and history.md's opening paragraph was reframed.

But the fix is partial: four "detached curate" references in history.md's body (lines 12, 46, 48, 58), the -f src/retry.ts example in curate.md's Quick Reference (line 28), and the Common Mistakes row in curate.md (line 137) all still ship to every supported agent on brv connectors install. An agent reading those files linearly today gets a coherent opener, then hits stale framing.

Score: 4.5 / 5 — net improvement from 4.25/5. The biggest first-impression failure mode is fixed; the remaining items are skill-doc fragments that an agent reading top-to-bottom might still trip on, but they're less likely to be hit on day 1 than the onboarding tour was.

What changed since the prior review

Commit Scope
dd56e2cc docs(skill) — removed the entire "Execution Mode" section from curate.md (lines 33-46), reframed history.md:8 opener away from "trusting detached curate", and replaced onboarding.md:239 Path A's brv curate -f <path> with brv curate "<description>: $(cat <path>)". 3 files changed, +2 / -17.

Verified fixes (from the May 27 review)

# Item Status
Critical-1a curate.md "Execution Mode" section instructing agents to use --detach and <logId> ✅ Section removed entirely (was lines 35-46)
Critical-1b onboarding.md:239 Path A using brv curate -f <path> (the load-bearing tour-failure case) ✅ Replaced with brv curate "<description>: $(cat <path>)" — exactly the suggested fix
Critical-1c history.md:8 framing history around "detached curate work" ✅ Reframed to "Useful for the user asking 'what was decided' or for confirming a recent curate's final shape"

Still standing (carry-over from May 27)

Posted inline:

  1. curate.md:28 — Quick Reference still has brv curate "..." -f src/retry.ts. The first invocation example in the file; agents skim this before reading the session protocol below. Suggested replacement inlines the file context.
  2. curate.md:137 — Common Mistakes row still mentions "detached curate work / verify with brv curate view <logId>". Tool-mode curate is synchronous; the failure mode this row was guarding against no longer exists. Either drop the row or replace with a still-live failure mode.
  3. history.md:12, 46, 48, 58 — four stale "detached curate" references. The body of the file still frames brv curate view as a verification-after-detachment tool; should be reframed as post-hoc inspection. The <logId> argument shape itself is still valid (brv curate view <logId> is a real subcommand).
  4. review.md:38 — describes --disable as suppressing "per-op review marker in detached curate-log entries". Underlying behavior is still correct; the framing is dated.
  5. curate.md:12 + query.md:10 — hardcoded Hermes workdir ~/.hermes/byterover/ ignores HERMES_HOME; double-space typo. Flagged in May 27 review; unchanged.

Other carry-overs from prior reviews (unchanged)

  • curate-log-entry.ts:58 — docstring still references removed useHtmlContextTree flag (May 21 review).
  • curate/index.ts:77-79 + query.ts:64BRV_CURATE_TOOL_MODE / BRV_QUERY_TOOL_MODE "no-op" comments misleading (env vars were removed, not made into no-ops) (May 26 review).
  • daemon-client.ts:36, 38, 41, 181-182, 274-277 + task-validation.ts:26, 36 + tui/utils/error-messages.ts:17, 18, 21 + provider-handler.ts:63 — error messages still reference removed brv providers connect/switch commands. Reachability is now mostly OAuth-refresh on cloud-sync paths.
  • curate-prompt-builder.ts:119-123<existing-topic> block still inlines system-managed createdat/updatedat verbatim.
  • topic-loader.ts:43, 110-155 — regex parser still inconsistent with index-generator.ts:208-222 parse5.
  • migrate.ts:97-104 — VC-sync hint assumes brv vc init has been run; first-time migrators on a fresh .brv/ without .brv/git/ get three GIT_NOT_INITIALIZED errors (May 26 review).
  • MapConnector injection asymmetry + YamlMcpConfigWriter exists/remove malformed-YAML asymmetry (May 27 review).

Resolved (silently, in earlier merges)

  • ConsoleLogger chalk-on-stderr under --format json (May 20 / 22 / 26 reviews) — curate-session.ts no longer constructs new ConsoleLogger() directly. The sidecar bump and index regen now execute in the daemon (per the comments at curate-session.ts:297, 417), which has its own logger plumbing. ENG-2925's "route through daemon" closed this transparently.

Praise (posted inline)

  • onboarding.md:239 — the load-bearing replacement, with one bonus subtlety: pushing the agent to write a description alongside the file content gives the resulting bv-topic better retrieval surface from day 1.

Test plan suggestions

  • A unit test that scans every file in src/server/templates/skill/*.md and src/server/templates/sections/*.md for substrings from CURATE_REMOVED_FLAGS.flags (['--detach', '--files', '-f', '--folder', '-d', '--timeout']) and fails if any appear. Would auto-catch the curate.md:28 and any future drift on flag removals.
  • A grep-based test that ensures the literal word "detached" doesn't appear in any skill file (since tool-mode curate is synchronous and the concept no longer exists in the runtime).

Final disposition

LGTM with the partial-fix completion as a tight follow-up. dd56e2cc fixes the user-most-impactful piece (the onboarding tour) and the largest stale block (curate.md's Execution Mode section). The remaining items in curate.md:28 / :137, history.md:12-58, and review.md:38 are skill-doc fragments — none would crash a curate today, but an agent skimming Quick Reference or Common Mistakes will still see broken framing. The fix is text-only edits in 3 markdown files. Worth either landing in a follow-up commit before merge or as a tight post-merge polish — same pattern the team has consistently shipped on other carry-overs.

Todo

  • Identify changes since prior review baseline (dd56e2cc only)
  • Verify dd56e2cc fix completeness
  • Verify outstanding carry-over items from prior reviews
  • Post inline comments and final summary
    · Branch: proj/byterover-tool-mode

Comment thread src/server/templates/skill/curate.md Outdated

```bash
brv curate "Authentication middleware validates JWTs in src/middleware/auth.ts and rejects expired tokens before route handlers."
brv curate "Retry helper treats HTTP 429 as retryable with exponential backoff." -f src/retry.ts
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (partial address of prior review's critical finding): The "Execution Mode" section was correctly removed in dd56e2cc — good — but this Quick Reference example still uses -f src/retry.ts. --files / -f was removed in ENG-2880 (CURATE_REMOVED_FLAGS in src/oclif/lib/removed-flags.ts:73-91), so an agent that copy-pastes this example into a real curate call gets the migration-error envelope: "--files / -f: This flag has been removed. Tool-mode curate pre-packs file content into the bv-topic HTML directly."

The Quick Reference is the first concrete invocation block in the file — agents skim it before reading the full session protocol below. Inlining the file content matches the post-ENG-2837 model the rest of this file already documents.

Suggested change
brv curate "Retry helper treats HTTP 429 as retryable with exponential backoff." -f src/retry.ts
brv curate "Retry helper treats HTTP 429 as retryable with exponential backoff. Implementation: src/retry.ts uses exponential backoff with jitter."

Fix this →

Comment thread src/server/templates/skill/curate.md Outdated
|---|---|
| Sending markdown or JSON as the session response | Send one bare `<bv-topic>...</bv-topic>` HTML document |
| Omitting `keywords` when retrieval terms are obvious | Add comma-separated `keywords` on `<bv-topic>` |
| Claiming detached curate work is saved immediately | Verify completion with `brv curate view <logId> --format json` |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (partial address of prior review): The Common Mistakes row still frames "detached curate work" as a thing — but tool-mode curate is synchronous end-to-end (no --detach flag, no log-id-printed-on-completion). Per dd56e2cc's removal of the Execution Mode section, this row is now an orphan — it documents a workflow that no longer exists.

Since every curate is synchronous, there's no "claimed saved before it completes" failure mode left for this row to guard against. Cleanest fix: drop the row entirely.

If you want to keep a row in this slot for a still-live failure mode, the prior reviews surfaced one candidate: "Reporting sessionId as a save confirmation when the session is still in needs-llm-step state" — that's the closest analog under the new model.

Suggested change
| Claiming detached curate work is saved immediately | Verify completion with `brv curate view <logId> --format json` |
| Reporting completion before a session reaches `data.status: "done"` | Wait for `done` before telling the user the topic is saved |

Fix this →

Comment thread src/server/templates/skill/history.md Outdated
- You need to verify a detached `brv curate` completed.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (partial address of prior review): Line 8's opener was correctly reframed in dd56e2cc — but four other "detached curate" references in this same file weren't touched:

  • Line 12 (this line): "You need to verify a detached brv curate completed."
  • Line 46: "Use brv curate view <logId> --format json when verifying a detached curate before relying on it."
  • Line 48: "If a detached curate is still processing, wait or tell the user it is not complete."
  • Line 58: Common Mistakes row "Trusting a detached curate without checking status".

All four reference the deprecated detached-curate concept (--detach removed in ENG-2880; tool-mode curate is synchronous). An agent reading history.md linearly today gets a coherent opener (good), then hits four contradictory references that frame brv curate view as a verification-after-detachment tool rather than what it now is: a post-hoc inspection tool for sessions whose status the caller already knows.

brv curate view <logId> IS still a real subcommand (src/oclif/commands/curate/view.ts:14-20 takes a "Log entry ID") — so the <logId> examples in the Quick Reference at l.24-29 are fine. It's the "verify-after-detach" framing that's stale, not the <logId> argument shape.

Suggested fix for the "When To Inspect History" bullet here:

Suggested change
- You need to verify a detached `brv curate` completed.
- The user asks what was recently queried, curated, or what files a recent curate touched.

And drop the parallel framing from lines 46, 48, and 58 in the same pass.

Fix this →

Comment thread src/server/templates/skill/review.md Outdated

## Toggle The Review Log

`brv review` with no subcommand shows whether review is on or off. `--disable` stops `brv curate` from prompting for review on high-impact ops, suppresses the per-op review marker in detached curate-log entries, and prevents `brv dream` from queueing review items. `--enable` reverses all of the above. Existing pending items are unaffected — they remain listed by `brv review pending` and can still be approved or rejected.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit (post-dd56e2cc ripple): This line says --disable "suppresses the per-op review marker in detached curate-log entries" — but post-ENG-2880, every curate is synchronous; there's no "detached" curate-log entry anymore. The underlying behavior (the --disable flag short-circuits the per-op review-marker write path) is still correct, but the framing is dated.

The June 2025 behavior the line is describing is preserved by resolveReviewDisabled (src/server/infra/daemon/agent-process.ts and the AsyncLocalStorage plumbing), regardless of foreground vs background. Drop the "detached" qualifier:

Suggested change
`brv review` with no subcommand shows whether review is on or off. `--disable` stops `brv curate` from prompting for review on high-impact ops, suppresses the per-op review marker in detached curate-log entries, and prevents `brv dream` from queueing review items. `--enable` reverses all of the above. Existing pending items are unaffected — they remain listed by `brv review pending` and can still be approved or rejected.
`brv review` with no subcommand shows whether review is on or off. `--disable` stops `brv curate` from prompting for review on high-impact ops, suppresses per-op review markers in curate-log entries, and prevents `brv dream` from queueing review items. `--enable` reverses all of the above. Existing pending items are unaffected — they remain listed by `brv review pending` and can still be approved or rejected.

Non-blocking — the description still parses correctly, but it's the same drift the dd56e2cc commit was tightening.

Fix this →


> "One more thing — your persona is saved, but I don't know your codebase yet. I see `CLAUDE.md` in your repo. Want me to curate it as starter project context? Takes about 30 seconds."

If the user says yes, read the file yourself and curate its content: `brv curate "<one-line description>: $(cat <path>)"` (or the tool-mode session form). Confirm with one line.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: This is the most user-impactful piece of the prior review's critical finding — the onboarding tour's Path A would have hit brv curate -f <path> immediately after install and surfaced the ENG-2880 migration error at the most user-visible moment of the entire connector flow. Replacing with brv curate "<description>: $(cat <path>)" is exactly the post-ENG-2837 pattern (calling agent pre-packs file content into the bv-topic HTML) and matches what the rest of the skill files already document.

The replacement is also better-shaped than the original in a subtle way: it pushes the agent to write a description of what the file is, not just pipe bytes through. That description ends up in the bv-topic summary attribute downstream, giving the topic better retrieval surface from day 1.

DatPham-6996 and others added 6 commits May 27, 2026 15:14
- Update curate examples to include implementation context inline
- Replace detached-curate wording with session completion status guidance
- Clarify history and review template language around recent curates
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Claude finished @danhdoan's task in 2m 38s —— View job


Re-review of PR #663 (verification pass — no new PR-specific commits since dd56e2cc)

Score: 4.5 / 5 — held from the prior review. No regressions, but the partial-fix carry-overs from the May 27 cycle remain on disk.

What changed since the prior review

Walking the first-parent history from the May 27 baseline (dd56e2cc):

Commit Scope
11adb44a Merge proj/byterover-tool-mode into the ENG-2697 feature branch (no new content)
79a2dbfc PR #717 merge — dd56e2cc itself, already reviewed at May 27 07:50
d810f0d5 Release/3.16.1 — auth-handler change in a main release that landed via PR #720, reviewed in its own PR
90a6ba77 Main-sync merge pulling v3.16.1 into this branch (5 files, +99/-3)

No PR-specific code or doc commits have landed since dd56e2cc. The remaining surface to evaluate is the partial-fix state called out in the May 27 second review.

Carry-over verification

All five still-outstanding items from the May 27 07:56 review are confirmed unchanged on disk:

  1. src/server/templates/skill/curate.md:28 — Quick Reference still has brv curate "..." -f src/retry.ts. -f/--files was removed in ENG-2880 (CURATE_REMOVED_FLAGS in src/oclif/lib/removed-flags.ts:80-83). Agent copy-pasting the example hits the migration-error envelope.
  2. src/server/templates/skill/curate.md:137 — Common Mistakes row still references "detached curate work" + <logId> verification. Tool-mode curate is synchronous; the failure mode this row guards against no longer exists.
  3. src/server/templates/skill/history.md:12, 46, 48, 58 — four "detached curate" references unchanged; brv curate view <logId> is still framed as verify-after-detachment rather than post-hoc inspection.
  4. src/server/templates/skill/review.md:38--disable "suppresses the per-op review marker in detached curate-log entries" still has the detached qualifier.
  5. src/server/templates/skill/curate.md:12 + query.md:10 — Hermes workdir still hardcoded as ~/.hermes/byterover/ (ignores HERMES_HOME); both retain the double-space typo at \workdir``.

Other carry-overs still standing (status quo, reachability shrinking):

  • src/oclif/lib/daemon-client.ts:36, 38, 41, 181-182, 274, 276 + src/tui/utils/error-messages.ts:21 + src/server/infra/daemon/task-validation.ts:26, 36 — six locations still reference removed brv providers connect/switch and /provider REPL commands. Reachability is near-zero post-ENG-2925 (tool-mode types skip validateProviderForTask), but OAuth-refresh on cloud-sync (brv push/pull) still emits the wrong advice; correct replacement is brv login.
  • src/server/core/domain/render/curate-prompt-builder.ts:119-123<existing-topic> block still inlines verbatim createdat/updatedat that the schema's .superRefine now rejects.
  • src/server/infra/dream/tool-mode/topic-loader.ts:43, 110-155 — regex parser still inconsistent with src/server/infra/context-tree/index-generator.ts:208-222 parse5 for the same <bv-topic> attribute extraction.

Why I'm not posting inline comments this round

The May 27 07:56 review already posted inline comments at exactly these locations (curate.md:28, curate.md:137, history.md:12, review.md:38, curate.md:12). Duplicating them on the same lines without new commits would clutter the review thread; the prior comments still apply verbatim.

If the intent is to land the cleanup before merge, the fix is text-only edits in three markdown files plus the four daemon-client/task-validation/error-messages locations — matches the pattern of dd56e2cc (which knocked down the highest-impact piece, the onboarding tour). Fix the remaining skill-doc drift →

Final disposition

LGTM with the doc-cleanup follow-up. The PR-specific code surface is unchanged since the May 27 review, so the prior 4.5/5 holds. The team has shown a consistent pattern of post-merge polish on carry-overs (8 of 9 critical-class items from the original May 15 review have closed), and the remaining items are non-blocking text edits.

If you'd like me to actually make the doc edits + the daemon-client/task-validation/error-messages text fixes in a commit on this branch, reply with @claude fix the carry-over items and I'll push the patch directly.

Todo

  • Identify changes since prior review baseline (dd56e2cc — no PR-specific commits)
  • Verify the carry-over items the May 27 review flagged
  • Spot-check disk state for the daemon-client / task-validation / topic-loader references
  • Post final summary with score
    · Branch: proj/byterover-tool-mode

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 27, 2026

Claude finished @DatPham-6996's task in 3m 8s —— View job


Re-review of PR #663 after the PR #723 skill-doc cleanup

Four PR-specific commits since dd56e2cc (the prior review's baseline), all in PR #723 (feat/ENG-2697 continuation): ba1863b6c9c76148533bb0a085cc6abf. Plus the main-sync merge 90a6ba77 pulling v3.16.1 (PR #720, reviewed in its own thread).

The four commits form a tight skill-doc cleanup pass that closes 4 of the 5 outstanding carry-over items from the May 27 07:56 second review — three of them landing with my exact suggested wording.

Score: 4.75 / 5 — net improvement from 4.5/5. The highest-impact skill-doc drift is fully resolved.

What changed since dd56e2cc

Commit Scope
ba1863b6 docs(skill): clarify curate completion and review guidance — addresses curate.md:28 (-f flag), curate.md:137 (Common Mistakes detached row), history.md:12 (When To Inspect History bullet), review.md:38 (detached qualifier).
c9c76148 Removes a duplicated "user asks what was recently queried or curated" bullet in history.md — a leftover from the original dd56e2cc change. Small hygiene fix.
533bb0a0 Closes the remaining three detached-curate references in history.md body (lines 46, 48, 58) — rewriting the Curate History paragraph, deleting the "if a detached curate is still processing" line entirely, and recasting the Common Mistakes row to "Trusting a recent curate without confirming it reached completed".
85cc6abf Refines curate.md:28 once more: single-sentence form Retry helper in src/retry.ts treats HTTP 429 as retryable with exponential backoff. — drops the slightly redundant "Implementation:" prefix from ba1863b6 for a cleaner shape.

Carry-over status

Closed in this batch (4 of 5):

# Item Closing commit
1 curate.md:28 Quick Reference using brv curate "..." -f src/retry.ts ba1863b6 + refined 85cc6abf
2 curate.md:137 Common Mistakes row about "detached curate work" + <logId> ba1863b6 (exact wording from my prior suggestion)
3 history.md:12, 46, 48, 58 — four "detached curate" references ba1863b6 (l.12) + 533bb0a0 (l.46, 48, 58)
4 review.md:38 detached qualifier on curate-log entries ba1863b6 (exact wording from my prior suggestion)

⚠️ Still standing:

  • curate.md:12 + query.md:10 — Hermes workdir hardcoded as ~/.hermes/byterover/; ignores HERMES_HOME env override; double-space typo at at \workdir`. Confirmed unchanged via grep -n "hermes" src/server/templates/skill/*.md. Posted inline at curate.md:12in the May 27 07:14 review. Cleanest fix: move intobyterover-rules-block.mdwith env-var-aware resolution (that file already gets injected into HermesSOUL.md` via the autonomous-attachment patcher).

Other carry-overs from prior reviews (unchanged)

  • daemon-client.ts:36, 38, 41, 181, 182, 274, 276 + tui/utils/error-messages.ts:17, 18, 21 + task-validation.ts:26, 36 — error messages still reference removed brv providers connect/switch and /provider REPL. Reachability is mostly OAuth-refresh on cloud-sync paths (brv push/pull/vc push); recommended replacement is brv login.
  • curate-prompt-builder.ts:119-123<existing-topic> block still inlines system-managed createdat/updatedat verbatim. Heavier traffic now with tool-mode dream MERGE candidates exercising the UPDATE path under meta.previousSummary. One-regex strip pass before the prompt builder shape settles permanently.
  • topic-loader.ts:43, 110-155 — regex parser still inconsistent with index-generator.ts:208-222 parse5. Confirmed unchanged.

Praise

  • 533bb0a0 rewrite of history.md's body is unusually clean: instead of mechanically removing the word "detached" everywhere, the paragraph at line 45 was rewritten to honestly describe the new post-hoc inspection model ("verifying a recent curate run before relying on the saved topic"), the next paragraph was deleted entirely (it documented a workflow that no longer exists), and the Common Mistakes row was recast to a still-live failure mode ("Trusting a recent curate without confirming it reached completed"). That last one matters — most teams would have left a vacant row or a dead reference.
  • 85cc6abf second-pass refinement of the Quick Reference example. The ba1863b6 version (...exponential backoff. Implementation: src/retry.ts uses exponential backoff with jitter.) parsed as two clauses joined by an "Implementation:" tag. The new single-sentence form with the file path inline (Retry helper in src/retry.ts treats HTTP 429 as retryable with exponential backoff.) reads more naturally and avoids a redundancy between the leading clause and the trailing one. Small detail; demonstrably read-aloud-tested.
  • c9c76148 — catching a duplicated bullet introduced by the previous round's edit is the kind of meticulous self-review that prevents documentation rot from accumulating. Cheap to write, easy to miss.

Test plan suggestions

  • Optional: a unit test that scans every file in src/server/templates/skill/*.md for the literal word "detached" and asserts the count is zero. Since tool-mode curate is synchronous end-to-end, the concept no longer exists in the runtime, and any future re-introduction of the word in skill docs would be drift worth catching automatically. Same shape as the suggested CURATE_REMOVED_FLAGS substring test from earlier reviews.

Final disposition

LGTM. PR #723 is exemplary skill-doc cleanup — four scoped commits taking down four of the five outstanding items from the previous review, with three closures landing in the exact wording suggested. The Hermes workdir item is the last live skill-doc carry-over; everything else is non-skill carry-over that has been deferred across multiple reviews.

Todo

  • Identify what's changed since the prior review baseline (dd56e2cc)
  • Verify the four new docs commits close the May 27 carry-over items
  • Spot-check the remaining live carry-over (Hermes workdir + provider strings + existing-topic block + topic-loader regex)
  • Post final summary with score

· Branch: proj/byterover-tool-mode
·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants