mini-hermes: implementing web_search (#67)

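## Implementing the web_search tool
### Overall flow (the common industry pattern)
User asks a question -> LLM decides it needs external information -> calls web_search(query) -> search API returns results -> results are fed back to the LLM -> LLM writes a synthesized answer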

**Two architectures**
 - Single-round mode: user -> llm -> web_search -> result -> llm -> answer
 - Multi-round mode: user -> llm -> web_search(1) -> result -> llm(2) -> decides whether more information is needed -> web_search(2) -> result -> llm -> final answer
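
The multi-round mode can be sketched as a loop that keeps calling the model until it stops asking for a search. This is a minimal sketch under assumptions: `LlmStep`, `Llm`, `Search`, and `runAgent` are hypothetical names for illustration, and the model and search function are injected so that a real client can be swapped in. Single-round mode is the same loop with `maxToolRounds` set to 1.

```typescript
type Message = { role: 'user' | 'assistant' | 'tool'; content: string }

// What the model returns on each turn: a final answer or a search request.
type LlmStep =
  | { type: 'answer'; text: string }
  | { type: 'tool_call'; query: string }

// Hypothetical signatures; a real LLM client and search tool would be adapted to these.
type Llm = (messages: Message[], allowTools: boolean) => Promise<LlmStep>
type Search = (query: string) => Promise<string>

async function runAgent(
  question: string,
  llm: Llm,
  search: Search,
  maxToolRounds = 3,
): Promise<string> {
  const messages: Message[] = [{ role: 'user', content: question }]

  for (let round = 0; round < maxToolRounds; round++) {
    const step = await llm(messages, true)
    if (step.type === 'answer') return step.text

    // The model asked for more information: run the search and feed the result back.
    const result = await search(step.query)
    messages.push({ role: 'assistant', content: `web_search(${step.query})` })
    messages.push({ role: 'tool', content: result })
  }

  // Search budget exhausted: ask for a final answer with tools disabled.
  const final = await llm(messages, false)
  return final.type === 'answer' ? final.text : 'No answer produced.'
}
```

Capping the number of rounds keeps a model that never stops searching from looping forever.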

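### Search API comparison
| Option | Free tier | API key required | Result quality | Best for |
| --- | --- | --- | --- | --- |
| Tavily | 1000 requests/month | Yes | ★★★★★ | First choice for AI agents; designed for LLMs |
| DuckDuckGo scraping | Unlimited (subject to anti-bot rate limiting) | No | ★★★☆☆ | Zero-cost quick start |
| Self-hosted SearXNG | Unlimited | No | ★★★★☆ | Good privacy; aggregates multiple engines |
| Brave Search | 2000 requests/month | Yes (free signup) | ★★★★☆ | Good quality, generous free tier |
| Google CSE | 100 requests/day | Yes | ★★★★☆ | Small free tier |
| Bing Search | 1000 requests/month | Yes (Azure) | ★★★★☆ | Microsoft ecosystem |
| SerpAPI | 100 requests/month | Yes | ★★★★★ | Long-established; supports multiple search engines |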

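### Option details
**DuckDuckGo (starter option)**
- npm package: duck-duck-scrape
- Zero configuration; no API key needed
- Scrapes the DuckDuckGo search page directly
- Risk: may be rate-limited; long-term stability is not guaranteed
- Best for: MVP stage, personal projects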
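
Because scraping can be rate-limited, it may help to wrap the call in a retry with exponential backoff. The sketch below is illustrative only: `DdgRaw` is an assumed shape for what the scraping library returns (check it against the duck-duck-scrape typings), and `DdgSearchFn` and `ddgSearchWithRetry` are hypothetical names. The search function is injected, so the package's own search function or a test stub can be passed in.

```typescript
interface SearchResult {
  title: string
  url: string
  snippet: string
}

// Assumed response shape; verify against the library before relying on it.
interface DdgRaw {
  noResults: boolean
  results: Array<{ title: string; url: string; description: string }>
}
type DdgSearchFn = (query: string) => Promise<DdgRaw>

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

async function ddgSearchWithRetry(
  search: DdgSearchFn,
  query: string,
  maxResults = 5,
  attempts = 3,
  baseDelayMs = 500,
): Promise<SearchResult[]> {
  let lastError: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      const raw = await search(query)
      if (raw.noResults) return []
      return raw.results
        .slice(0, maxResults)
        .map(r => ({ title: r.title, url: r.url, snippet: r.description }))
    } catch (err) {
      lastError = err
      // Back off 1x, 2x, 4x ... the base delay before the next attempt.
      if (i < attempts - 1) await sleep(baseDelayMs * 2 ** i)
    }
  }
  throw new Error(`DuckDuckGo search failed after ${attempts} attempts: ${String(lastError)}`)
}
```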

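**Tavily API (recommended for production use)**
- A search engine designed for AI agents
- Returns clean, AI-optimized text, so no extra HTML parsing is needed
- Supports the include_answer parameter, which returns a pre-generated answer directly
- Response format: { answer: "...", results: [{title, url, content, score}] }
- Has a free tier; usage beyond it is billed per request
- Best for: production projects that care about search quality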
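
A provider for this option mostly has to map the response format above onto the tool's own result type. The following is a sketch, not Tavily's official client: `fromTavily` and `tavilySearch` are hypothetical names, and the endpoint URL, the Bearer header, and the `max_results` field are assumptions that should be checked against the current Tavily documentation.

```typescript
interface SearchResult {
  title: string
  url: string
  snippet: string
}
interface SearchResponse {
  answer?: string
  results: SearchResult[]
}

// Response shape as described above: { answer, results: [{title, url, content, score}] }
interface TavilyRaw {
  answer?: string | null
  results?: Array<{ title?: string; url?: string; content?: string; score?: number }>
}

// Pure mapping step: Tavily's `content` becomes our `snippet`; entries without a URL are dropped.
function fromTavily(raw: TavilyRaw): SearchResponse {
  const results = (raw.results ?? [])
    .filter(r => Boolean(r.url))
    .map(r => ({ title: r.title ?? '', url: r.url as string, snippet: r.content ?? '' }))
  return raw.answer ? { answer: raw.answer, results } : { results }
}

// Assumed request details (endpoint, auth header, field names); verify before use.
async function tavilySearch(query: string, apiKey: string, maxResults = 5): Promise<SearchResponse> {
  const res = await fetch('https://api.tavily.com/search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({ query, max_results: maxResults, include_answer: true }),
  })
  if (!res.ok) throw new Error(`Tavily request failed: HTTP ${res.status}`)
  return fromTavily((await res.json()) as TavilyRaw)
}
```

Keeping the mapping separate from the network call makes it easy to test without an API key.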

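**Self-hosted SearXNG**
- One-line Docker deployment: docker run -d -p 8080:8080 searxng/searxng
- Aggregates 70+ search engines, including Google, Bing, and DuckDuckGo
- Returns JSON; no limit on the number of calls
- Drawbacks: you have to maintain a service, and search quality depends on the upstream engines
- Best for: scenarios that require privacy or full control over the stack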
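
Querying a self-hosted instance could look like the sketch below. It assumes the instance listens on the port from the Docker command above and that the JSON output format is enabled in the instance's settings.yml (a default install may only serve HTML). `buildSearxngUrl` and `searxngSearch` are hypothetical names, and only the `title`, `url`, and `content` fields of each result are read.

```typescript
interface SearchResult {
  title: string
  url: string
  snippet: string
}

// Only the fields this sketch reads; the real response contains more.
interface SearxngRaw {
  results?: Array<{ title?: string; url?: string; content?: string }>
}

// Builds the query URL, tolerating a trailing slash on the base URL.
function buildSearxngUrl(baseUrl: string, query: string): string {
  const base = baseUrl.replace(/\/+$/, '')
  return `${base}/search?q=${encodeURIComponent(query)}&format=json`
}

async function searxngSearch(baseUrl: string, query: string, maxResults = 5): Promise<SearchResult[]> {
  const res = await fetch(buildSearxngUrl(baseUrl, query))
  if (!res.ok) throw new Error(`SearXNG request failed: HTTP ${res.status}`)
  const raw = (await res.json()) as SearxngRaw
  return (raw.results ?? [])
    .slice(0, maxResults)
    .map(r => ({ title: r.title ?? '', url: r.url ?? '', snippet: r.content ?? '' }))
}
```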

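### Data handling
Once the search returns, you need to decide what to hand back to the LLM:
1. Return summaries only: title + URL + snippet. This saves tokens and suits most Q&A scenarios.
2. Fetch the full page, then extract: use fetch + cheerio to parse the HTML, extract the main text, and drop navigation and similar page chrome. Token usage is high, so the text has to be truncated. Suits deep-reading scenarios.
3. Jina Reader API: request r.jina.ai/https://example.com to get clean text. Free tier: 1000 requests/day.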
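
Options 1 and 3 can be illustrated with a few small helpers. This is a sketch under assumptions: `truncate`, `formatForLlm`, and `toJinaReaderUrl` are hypothetical names, and the 300-character snippet limit is an arbitrary example value rather than a recommendation from the source.

```typescript
interface SearchResult {
  title: string
  url: string
  snippet: string
}

// Collapse whitespace and cut the text to at most maxChars characters.
function truncate(text: string, maxChars: number): string {
  const clean = text.replace(/\s+/g, ' ').trim()
  if (clean.length <= maxChars) return clean
  return clean.slice(0, maxChars - 1).replace(/\s+$/, '') + '…'
}

// Option 1: a compact title + URL + snippet listing that keeps token usage low.
function formatForLlm(results: SearchResult[], maxSnippetChars = 300): string {
  return results
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.url}\n${truncate(r.snippet, maxSnippetChars)}`)
    .join('\n\n')
}

// Option 3: prefix the page URL with the Jina Reader host to request clean text.
function toJinaReaderUrl(pageUrl: string): string {
  return `https://r.jina.ai/${pageUrl}`
}
```

The same `truncate` helper can also cap the text extracted in option 2 before it is sent to the model.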


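### Code implementation
To support different tools, the main loop should not be coupled to the logic of any single tool. The code is as follows: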
```ts
export interface ToolDefinition { // Definition of a tool
  name: string
  description: string
  input_schema: {
    type: 'object'
    properties?: unknown | null
    required?: Array<string> | null
    [k: string]: unknown
  }
  execute: (input: Record<string, unknown>) => Promise<string>
}

export interface SearchResult {
  title: string
  url: string
  snippet: string
}

export interface SearchResponse {
  answer?: string
  results: SearchResult[]
}

export interface SearchProvider {
  name: string
  search: (query: string, options?: SearchOptions) => Promise<SearchResponse>
}

export interface SearchOptions {
  maxResults?: number
  includeAnswer?: boolean
}

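// Arguments the LLM passes to web_search (mirrors input_schema below).
interface WebSearchInput {
  query: string
  max_results?: number
  provider?: string
}

// getSearchProvider and summarizeResults are not shown in this snippet.
export const webSearchTool: ToolDefinition = {
  name: 'web_search',
  description: 'Search the web with an external search engine and return a summary of title + URL + snippet. DuckDuckGo is the current default.',
  input_schema: {
    type: 'object',
    properties: {
      query: { type: 'string', description: 'The question or keywords to search for' },
      max_results: { type: 'number', description: 'Number of results to return; default 5, max 10' },
      provider: { type: 'string', description: 'Search provider; duckduckgo and tavily are currently supported. If one provider errors or times out, you can switch to the other.' },
    },
    required: ['query'],
  },
  execute: async (input) => {
    const { query, max_results, provider } = input as unknown as WebSearchInput
    // The input comes from the LLM, so validate it instead of trusting the cast.
    if (typeof query !== 'string' || query.trim() === '') {
      throw new Error('web_search requires a non-empty "query" string')
    }
    const q = query.trim()

    const providerName = provider || 'duckduckgo'
    const providerImpl = getSearchProvider(providerName)

    // Fall back to 5 when max_results is missing or not a finite number, then clamp to 1..10.
    const maxResults = typeof max_results === 'number' && Number.isFinite(max_results) ? Math.floor(max_results) : 5
    const response = await providerImpl.search(q, {
      maxResults: Math.max(1, Math.min(maxResults, 10)),
      includeAnswer: true,
    })
    return summarizeResults(providerImpl.name, q, response)
  },
}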
```
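
The snippet above calls `getSearchProvider` and `summarizeResults` without showing them. One possible shape for these helpers is sketched below; it is an assumption about how they might work, not the project's actual code, and `registerSearchProvider` is a hypothetical addition. The interfaces are repeated so the block stands on its own.

```typescript
interface SearchResult { title: string; url: string; snippet: string }
interface SearchResponse { answer?: string; results: SearchResult[] }
interface SearchOptions { maxResults?: number; includeAnswer?: boolean }
interface SearchProvider {
  name: string
  search: (query: string, options?: SearchOptions) => Promise<SearchResponse>
}

// Registry keyed by provider name, so the tool never imports a provider directly.
const providers = new Map<string, SearchProvider>()

function registerSearchProvider(provider: SearchProvider): void {
  providers.set(provider.name, provider)
}

function getSearchProvider(name: string): SearchProvider {
  const provider = providers.get(name)
  if (!provider) {
    // Listing the alternatives gives the LLM something to switch to.
    const available = Array.from(providers.keys()).join(', ') || 'none'
    throw new Error(`Unknown search provider "${name}". Available: ${available}`)
  }
  return provider
}

// Renders the response as plain text for the LLM.
function summarizeResults(providerName: string, query: string, response: SearchResponse): string {
  const lines = [`Provider: ${providerName}`, `Query: ${query}`]
  if (response.answer) lines.push(`Answer: ${response.answer}`)
  if (response.results.length === 0) lines.push('No results found.')
  response.results.forEach((r, i) => {
    lines.push(`${i + 1}. ${r.title}\n   ${r.url}\n   ${r.snippet}`)
  })
  return lines.join('\n')
}
```

With a registry like this, adding a provider is a single `registerSearchProvider` call and the tool definition stays unchanged.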

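Note: the description of the provider field says "If one provider errors or times out, you can switch to the other." Because the LLM reads this description, it can switch search providers on its own when a search fails.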
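
For the model to switch providers, it first has to see that a search failed. One way to surface failures is to wrap `execute` so that errors and timeouts come back as a tool result instead of an exception. This is a sketch under assumptions: `withTimeout` and `reportErrors` are hypothetical helpers, the 10-second default is arbitrary, and whether the main loop already catches tool errors is not stated in the source.

```typescript
type Execute = (input: Record<string, unknown>) => Promise<string>

// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms)
    promise.then(
      value => { clearTimeout(timer); resolve(value) },
      error => { clearTimeout(timer); reject(error) },
    )
  })
}

// Turns a failure into a readable tool result that the LLM can act on.
function reportErrors(execute: Execute, timeoutMs = 10000): Execute {
  return async input => {
    try {
      return await withTimeout(execute(input), timeoutMs)
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err)
      const provider = String(input['provider'] ?? 'duckduckgo')
      return `web_search failed (provider: ${provider}): ${reason}. Consider retrying with a different provider.`
    }
  }
}
```

A tool could then be registered with `execute: reportErrors(webSearchTool.execute)`, so that a failing provider produces a message the model can react to.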


## What does hermes do when no paid search tool is registered? Does it simply have no web search?


## Gaps compared with a production environment





