
fix: strip think tags before extracting title when saving to wiki#406

Open
shisonghong-git wants to merge 2 commits into nashsu:main from shisonghong-git:fix/save-to-wiki-title-think-tags
shisonghong-git commented Jun 16, 2026

Fixes #404

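Problem

With reasoning models (e.g. deepseek-r1), the LLM reply may begin with <think>…</think>. The previous code extracted the title from the raw content first and stripped the think tags afterwards, so the saved query page title was polluted with the model's reasoning. Beyond that, even when the correct content is used, taking the first line of the answer as the title is a poor choice: the first line is often boilerplate, it is hard-truncated when too long, and it leaks base64 when the content contains a screenshot.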

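Fix

  1. Clean up first: strip <think> tags and sources comments before extracting the title.
  2. Prefer the question as the title: the title now comes from the user's question first (the first line of the answer is only a fallback).
  3. Summarize long questions: adds deriveTitleFromQuestion(); when the question is too long, it truncates at the nearest sentence/clause boundary within a 60-character budget and appends an ellipsis (…), so the cut does not land mid-word; it hard-truncates only when no boundary is found.
  4. Screenshot handling: strips ![...](...) image markdown and bare data:image;base64,... URIs; an image-only question yields an empty title, which falls back to the first line of the answer, while a mixed text-and-image question keeps only the text.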
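
Items 2 to 4 can be illustrated with a minimal sketch. This is not the code from the PR: only the name deriveTitleFromQuestion and the 60-character budget come from the description above. The boundary character set, the regular expressions, the choice to count the ellipsis inside the budget, and the pickTitle helper are all assumptions made for illustration.

```typescript
// Budget stated in the PR description; counting the ellipsis inside it is an assumption.
const MAX_TITLE_LENGTH = 60;

// Sentence/clause punctuation treated as a cut point (assumed set, CJK and ASCII).
const BOUNDARY = /[。!?;,、.!?;,]/;

// Remove image markdown and bare base64 data URIs so no blob reaches the title.
function stripImages(text: string): string {
  return text
    .replace(/!\[[^\]]*\]\([^)]*\)/g, " ")
    .replace(/data:image\/[a-z0-9.+-]+;base64,[A-Za-z0-9+\/=]+/gi, " ");
}

function deriveTitleFromQuestion(question: string): string {
  // Collapse newlines and runs of whitespace into single spaces.
  const text = stripImages(question).replace(/\s+/g, " ").trim();
  if (text.length <= MAX_TITLE_LENGTH) return text;

  // Reserve one character for the ellipsis.
  const head = text.slice(0, MAX_TITLE_LENGTH - 1);

  // Scan backwards for the nearest boundary inside the budget.
  let cut = -1;
  for (let i = head.length - 1; i > 0; i--) {
    if (BOUNDARY.test(head[i])) {
      cut = i;
      break;
    }
  }

  // Hard truncation only when no boundary was found.
  const body = cut > 0 ? head.slice(0, cut) : head;
  return body.trimEnd() + "…";
}

// Hypothetical helper: prefer the question, fall back to the answer's first non-empty line.
function pickTitle(question: string, cleanedAnswer: string): string {
  const fromQuestion = deriveTitleFromQuestion(question);
  if (fromQuestion) return fromQuestion;
  const firstLine = cleanedAnswer.split("\n").find((l) => l.trim().length > 0) ?? "";
  return firstLine.trim();
}
```

In this sketch an image-only question produces an empty string, which is what triggers the fallback in pickTitle; a question that mixes text and images keeps only its text.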

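Tests

  • deriveTitleFromQuestion unit tests: short question, newline collapsing, empty/image-only, mixed text and image, long-question summarization, hard truncation without a boundary; all pass.
  • TypeScript type checking passes.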

Reasoning models (e.g. deepseek-r1) may start their reply with
<think>…</think>. The previous code extracted the title from the raw
content first and cleaned think tags second, causing the saved
query page title to be polluted with the model's internal
reasoning text instead of the actual answer.

Move content cleanup before title extraction so the saved title
reflects the real answer content.
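
A minimal sketch of the reordering described in this commit, assuming the cleanup is regex-based. The helper names cleanAnswer and firstLineTitle are hypothetical, and the shape of the sources comment (an HTML comment starting with "sources") is an assumption, not something taken from the repository.

```typescript
// Matches a complete <think>…</think> block, including newlines inside it.
const THINK_BLOCK = /<think>[\s\S]*?<\/think>/gi;

// Assumed format: sources embedded as an HTML comment such as <!-- sources: ... -->.
const SOURCES_COMMENT = /<!--\s*sources[\s\S]*?-->/gi;

// Step 1: clean the raw reply.
function cleanAnswer(raw: string): string {
  return raw.replace(THINK_BLOCK, "").replace(SOURCES_COMMENT, "").trim();
}

// Step 2: only then take the first non-empty line, dropping markdown heading markers.
function firstLineTitle(cleaned: string): string {
  const line = cleaned
    .split("\n")
    .map((l) => l.trim())
    .find((l) => l.length > 0) ?? "";
  return line.replace(/^#+\s*/, "");
}

const raw = "<think>The user is asking about X...</think>\n\n# Answer\nBody text";
const title = firstLineTitle(cleanAnswer(raw)); // "Answer"
```

Calling firstLineTitle(raw) directly would return the reasoning text, which is the bug described above. The sketch does not handle an unclosed <think> tag.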

shisonghong-git (Author) commented:

This can resolve issue #414 (comment)

…pping

The saved query title now prefers the user's question. Long questions are
summarized at the nearest sentence/clause boundary (with an ellipsis) instead
of a hard mid-word slice, and image markdown / base64 data URIs are stripped so
image-only or image-heavy questions don't leak a blob into the title (falling
back to the answer's first line when no usable question text remains).

Refs nashsu#404

Development

Successfully merging this pull request may close these issues.

Title is wrong when saving a Q&A