Add reranked_search tool #201
base: develop
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| # Reranked Search Source | ||
|
|
||
| A reranking layer over multiple search tools for NeMo Agent Toolkit workflows. It fans out a query to the configured search tools in parallel, reranks the combined results by relevance, and returns the top-k results. | ||
|
|
||
| ## How It Works | ||
| 1. The reranker receives a query from the agent | ||
| 2. Calls all `search_tools` in parallel with the same query | ||
| 3. Scores and ranks all results across sources using the cross-encoder reranking model | ||
| 4. Returns the top-k results to the agent | ||
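
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `fake_web_search`, `fake_paper_search`, and `overlap_score` are hypothetical stand-ins, and a toy token-overlap score replaces the cross-encoder model so the example runs offline.

```python
import asyncio


async def fake_web_search(query: str) -> list[str]:
    # Hypothetical stand-in for a real search tool.
    return ["GPU memory bandwidth explained", "Best pasta recipes"]


async def fake_paper_search(query: str) -> list[str]:
    # Hypothetical stand-in for a second search tool.
    return ["A survey of GPU memory hierarchies", "History of typography"]


def overlap_score(query: str, document: str) -> float:
    # Toy relevance score: fraction of query tokens found in the document.
    # The real tool uses a cross-encoder reranking model instead.
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    return len(query_tokens & doc_tokens) / max(len(query_tokens), 1)


async def reranked_search(query: str, tools, top_k: int = 2) -> list[str]:
    # Step 2: call every tool in parallel with the same query.
    per_tool_results = await asyncio.gather(*(tool(query) for tool in tools))
    # Step 3: pool results across sources and sort by descending score.
    pooled = [doc for results in per_tool_results for doc in results]
    ranked = sorted(pooled, key=lambda doc: overlap_score(query, doc), reverse=True)
    # Step 4: keep only the top-k results.
    return ranked[:top_k]


top = asyncio.run(reranked_search("GPU memory", [fake_web_search, fake_paper_search]))
print(top)
```

Pooling before ranking is what lets a strong result from one source outrank a weak result from another, which is the reason to rerank across sources rather than per tool.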
|
|
||
| ## Environment Variables | ||
|
|
||
| By default, the reranker model is invoked through build.nvidia.com and requires `NVIDIA_API_KEY` to run model inference: | ||
|
|
||
| ```bash | ||
| NVIDIA_API_KEY=your_nvidia_api_key | ||
| ``` | ||
|
|
||
| ## Example Workflow Configuration | ||
|
|
||
| Define the `reranked_search` tool and the other search tools that feed into the reranker. If the search tools are part of a function group, they must be listed in the group's `include:` list and referenced in the `{group_name}__{tool_name}` format in the `reranked_search` config section. | ||
|
|
||
| For more information on function group namespaces, see the NeMo Agent Toolkit documentation, specifically [Function Naming and Namespacing](https://docs.nvidia.com/nemo/agent-toolkit/latest/build-workflows/functions-and-function-groups/function-groups.html#function-naming-and-namespacing) and [Understanding Function Accessibility](https://docs.nvidia.com/nemo/agent-toolkit/latest/build-workflows/functions-and-function-groups/function-groups.html#understanding-function-accessibility). | ||
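
The naming rule for grouped tools can be illustrated with a small helper. This is only a sketch: `qualified_tool_name` and `search_tool_refs` are hypothetical names used for illustration and are not part of the toolkit.

```python
def qualified_tool_name(group_name: str, tool_name: str) -> str:
    # Build the `{group_name}__{tool_name}` reference (two underscores).
    return f"{group_name}__{tool_name}"


def search_tool_refs(standalone: list[str], groups: dict[str, list[str]]) -> list[str]:
    # Standalone functions keep their own name; grouped tools get the
    # group prefix for every tool in the group's `include` list.
    refs = list(standalone)
    for group_name, included in groups.items():
        refs.extend(qualified_tool_name(group_name, tool) for tool in included)
    return refs


refs = search_tool_refs(
    standalone=["web_search_tool", "your_custom_search_tool"],
    groups={"your_group": ["tool_1", "tool_2"]},
)
print(refs)
```

The resulting list matches the shape of the `search_tools` entries in the configuration.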
|
|
||
| ```yaml | ||
| function_groups: | ||
| your_group: | ||
| _type: your_group | ||
| include: [tool_1, tool_2] | ||
| ... | ||
|
|
||
| functions: | ||
| web_search_tool: | ||
| _type: tavily_web_search | ||
| max_results: 5 | ||
|
|
||
| your_custom_search_tool: | ||
| _type: your_custom_search | ||
| ... | ||
|
|
||
| reranked_search: | ||
| _type: reranked_search | ||
| # required configs | ||
| cross_encoder_model: nv-rerank-qa-mistral-4b:1 | ||
| search_tools: | ||
| - web_search_tool # standalone function examples | ||
| - your_custom_search_tool | ||
| - your_group__tool_1 # function group examples | ||
| - your_group__tool_2 | ||
|
|
||
| # # uncomment to adjust default values | ||
| # top_k: 5 # raise this as you add more search tools, since more tools produce more results to rerank | ||
| # timeout_seconds: 10 # per-tool timeout | ||
| ``` | ||
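
The `timeout_seconds` option above is described as a per-tool timeout. One way such a timeout could behave is sketched below with `asyncio.wait_for`. The tool functions and the choice to drop timed-out tools are assumptions made for illustration; the actual tool may handle timeouts differently.

```python
import asyncio


async def fast_tool(query: str) -> str:
    # Hypothetical tool that answers quickly.
    await asyncio.sleep(0.01)
    return f"fast result for {query}"


async def slow_tool(query: str) -> str:
    # Hypothetical tool that exceeds the timeout.
    await asyncio.sleep(1.0)
    return f"slow result for {query}"


async def call_with_timeout(tool, query: str, timeout_seconds: float):
    # Apply the timeout to each tool separately so a slow tool
    # cannot delay or discard the results of the others.
    try:
        return await asyncio.wait_for(tool(query), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return None  # Assumption: a timed-out tool contributes no results.


async def fan_out(query: str, tools, timeout_seconds: float) -> list[str]:
    results = await asyncio.gather(
        *(call_with_timeout(tool, query, timeout_seconds) for tool in tools)
    )
    return [result for result in results if result is not None]


results = asyncio.run(fan_out("nemo", [fast_tool, slow_tool], timeout_seconds=0.2))
print(results)
```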
|
|
||
| Then give it to an agent as its only tool: | ||
|
|
||
| ```yaml | ||
| shallow_research_agent: | ||
| _type: shallow_research_agent | ||
| llm: nemotron_nano_llm | ||
| tools: | ||
| - reranked_search | ||
| ``` | ||
|
|
||
| See `sources/reranker/example_cli_config.yml` for a full working example. | ||
|
|
||
| ### Supported Reranker Models | ||
| Choose any reranking model available on build.nvidia.com. | ||
|
|
||
| ## Make Your Source Compatible with Reranker | ||
| All built-in sources under the `./sources` folder are already supported. | ||
|
|
||
| To design a new source that supports reranking, there is only one requirement: | ||
| * Use `aiq_agent.common.SOURCE_DELIMITER` to join all the search result strings returned by your tool. The reranker tool uses the same delimiter to split the combined string back into separate sources and rerank them by relevance. |
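
The join-and-split contract can be sketched as follows. The delimiter value here is a placeholder chosen for illustration, since the actual value of `aiq_agent.common.SOURCE_DELIMITER` is not shown in this change. The helper names are likewise hypothetical.

```python
# Placeholder value for illustration only; real sources should import
# SOURCE_DELIMITER from aiq_agent.common instead of defining their own.
SOURCE_DELIMITER = "\n<<<EXAMPLE_DELIMITER>>>\n"


def format_search_results(results: list[str]) -> str:
    # What a compatible source tool does: join results with the delimiter.
    return SOURCE_DELIMITER.join(results)


def split_search_results(tool_output: str) -> list[str]:
    # What the reranker is described as doing: split on the same delimiter.
    return [chunk for chunk in tool_output.split(SOURCE_DELIMITER) if chunk.strip()]


tool_output = format_search_results(["Result A: ...", "Result B: ..."])
sources = split_search_results(tool_output)
print(sources)
```

Because both sides rely on the same constant, a source that joins results with any other separator would reach the reranker as a single undivided document.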
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| # This is the default configuration for the CLI mode. | ||
| # It has the following features: | ||
| # - Human-in-the-loop clarification and plan approval before deep research starts | ||
| # - Web search and Paper search tools by default | ||
| # - There is no knowledge retrieval | ||
|
|
||
| general: | ||
| telemetry: | ||
| logging: | ||
| console: | ||
| _type: console | ||
| level: INFO | ||
| # tracing: | ||
| # langsmith: # Optional: LangSmith tracing - requires langsmith API key. Set using `export LANGSMITH_API_KEY=<your-langsmith-api-key>` | ||
| # _type: langsmith | ||
| # project: nvidia-aiq | ||
|
|
||
| llms: | ||
| nemotron_llm_intent: | ||
| _type: nim | ||
| model_name: nvidia/nemotron-3-nano-30b-a3b | ||
| base_url: "https://integrate.api.nvidia.com/v1" | ||
| temperature: 0.5 | ||
| top_p: 0.9 | ||
| max_tokens: 4096 | ||
| num_retries: 5 | ||
| chat_template_kwargs: | ||
| enable_thinking: true | ||
|
|
||
| nemotron_nano_llm: | ||
| _type: nim | ||
| model_name: nvidia/nemotron-3-nano-30b-a3b | ||
| base_url: "https://integrate.api.nvidia.com/v1" | ||
| temperature: 0.5 | ||
| top_p: 0.9 | ||
| max_tokens: 4096 | ||
| num_retries: 5 | ||
| chat_template_kwargs: | ||
| enable_thinking: false | ||
|
|
||
| gpt_oss_llm: | ||
| _type: nim | ||
| model_name: openai/gpt-oss-120b | ||
| base_url: https://integrate.api.nvidia.com/v1 | ||
| temperature: 1.0 | ||
| top_p: 1.0 | ||
| max_tokens: 256000 | ||
| api_key: ${NVIDIA_API_KEY} | ||
| max_retries: 10 | ||
|
|
||
| # Nemotron Super is compatible and tested with AIQ but has limited availability | ||
| # on the Build API due to high demand. | ||
| # Uncomment nemotron_super_llm below if the endpoint is accessible. | ||
| # nemotron_super_llm: | ||
| # _type: nim | ||
| # model_name: nvidia/nemotron-3-super-120b-a12b | ||
| # base_url: "https://integrate.api.nvidia.com/v1" | ||
| # temperature: 1.0 | ||
| # top_p: 1.0 | ||
| # max_tokens: 128000 | ||
| # num_retries: 5 | ||
| # chat_template_kwargs: | ||
| # enable_thinking: true | ||
|
|
||
| functions: | ||
| # ========================================================================= | ||
| # Data Source Registry | ||
| # ========================================================================= | ||
| # Central registry that controls: | ||
| # 1. UI toggles — each source appears as an on/off switch in the frontend | ||
| # 2. Per-message filtering — users can select active sources per request | ||
| # 3. Tool auto-inheritance — agents with no explicit `tools` list receive | ||
| # every tool listed here (use `exclude_tools` on agents to specialize) | ||
| # | ||
| # Source entry fields: | ||
| # id, name, description, tools, requires_auth (default: false), | ||
| # default_enabled (default: true) | ||
| # | ||
| # See docs/source/customization/tools-and-sources.md for full details. | ||
| # ========================================================================= | ||
| data_sources: | ||
| _type: data_source_registry | ||
| sources: | ||
| - id: web_search | ||
| name: "Web Search" | ||
| description: "Search the web for real-time information." | ||
| tools: | ||
| - web_search_tool | ||
| - advanced_web_search_tool | ||
| # - id: paper_search | ||
| # name: "Academic Papers" | ||
| # description: "Search academic papers and scientific publications." | ||
| # tools: | ||
| # - paper_search_tool | ||
|
|
||
| web_search_tool: | ||
| _type: tavily_web_search | ||
| max_results: 5 | ||
| max_content_length: 1000 | ||
|
|
||
| advanced_web_search_tool: | ||
| _type: tavily_web_search | ||
| max_results: 2 | ||
| advanced_search: true | ||
|
|
||
| # Paper Search (requires SERPER_API_KEY) | ||
| # Enabled here because reranked_search below lists paper_search_tool; set SERPER_API_KEY before running. | ||
| paper_search_tool: | ||
| _type: paper_search | ||
| max_results: 5 | ||
| serper_api_key: ${SERPER_API_KEY} | ||
|
|
||
| # ======================================================================== | ||
| # Example reranked_search tool config. Useful when 2 or more sources are given to the agent. | ||
| # ======================================================================== | ||
| reranked_search: | ||
| _type: reranked_search | ||
| cross_encoder_model: nv-rerank-qa-mistral-4b:1 | ||
| search_tools: | ||
| - web_search_tool | ||
| - paper_search_tool | ||
| top_k: 5 | ||
|
|
||
|
|
||
| # ========================================================================= | ||
| # Agents — inherit all registry tools; use exclude_tools to specialize | ||
| # ========================================================================= | ||
| intent_classifier: | ||
| _type: intent_classifier | ||
| llm: nemotron_llm_intent | ||
| # tools: omitted -> inherits all from data_source_registry | ||
| # exclude_tools: [] | ||
| # llm_timeout: 90 # optional; seconds for intent LLM call (default 90) | ||
|
|
||
| clarifier_agent: | ||
| _type: clarifier_agent | ||
| llm: nemotron_nano_llm # replace with nemotron_super_llm if available | ||
| planner_llm: nemotron_nano_llm # replace with nemotron_super_llm if available | ||
| # tools: omitted -> inherits all from data_source_registry | ||
| # exclude_tools: [] | ||
| max_turns: 3 | ||
| enable_plan_approval: true | ||
| log_response_max_chars: 2000 | ||
| verbose: true | ||
|
|
||
| shallow_research_agent: | ||
| _type: shallow_research_agent | ||
| llm: nemotron_nano_llm | ||
| # tools: set explicitly so this agent only calls reranked_search (omit to inherit all from data_source_registry) | ||
| tools: | ||
| - reranked_search | ||
| # exclude_tools: | ||
| # - advanced_web_search_tool | ||
| max_llm_turns: 10 | ||
| max_tool_iterations: 5 | ||
|
|
||
| deep_research_agent: | ||
| _type: deep_research_agent | ||
| orchestrator_llm: gpt_oss_llm | ||
| researcher_llm: nemotron_nano_llm # replace with nemotron_super_llm if available | ||
| planner_llm: gpt_oss_llm | ||
| max_loops: 2 | ||
| tools: | ||
| - reranked_search | ||
|
|
||
| workflow: | ||
| _type: chat_deepresearcher_agent | ||
| enable_escalation: true | ||
| enable_clarifier: true | ||
| checkpoint_db: ${AIQ_CHECKPOINT_DB:-./checkpoints.db} | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| [build-system] | ||
| build-backend = "setuptools.build_meta" | ||
| requires = ["setuptools >= 64", "setuptools-scm>=8"] | ||
|
|
||
| [tool.setuptools] | ||
| packages = ["reranked_search"] | ||
| package-dir = {"reranked_search" = "src"} | ||
|
|
||
| [project] | ||
| name = "reranked-search" | ||
| version = "1.0.0" | ||
| description = "Reranking layer over multiple search tools (BM25 and dense retrieval)" | ||
| readme = "README.md" | ||
| requires-python = ">=3.11,<3.14" | ||
| license = {text = "Apache-2.0"} | ||
| dependencies = [ | ||
| "nvidia-nat==1.5.0", | ||
| "pydantic>=2.0.0", | ||
| "langchain-core>=1.1.0", | ||
| "langchain-nvidia-ai-endpoints>=1.1.0" | ||
|
Review comment on lines +28 to +35 (Contributor):
dependencies = [
"aiq-agent",
"nvidia-nat==1.5.0",
"pydantic>=2.0.0",
"langchain-core>=1.1.0",
"langchain-nvidia-ai-endpoints>=1.1.0"
] |
||
| ] | ||
|
|
||
| [project.entry-points."nat.plugins"] | ||
| reranked_search = "reranked_search.register" | ||