Add NovaBot and NovaBot Simulator agents #2
Conversation
…oints, and documentation. Introduced metadata and vector files for enhanced functionality. Updated shared components for improved modularity and tracing integration.
📝 WalkthroughWalkthroughAdds a RAG-enabled NovaBot agent and simulator, local scraping and embedding-index build scripts, config/dataclasses and requirements, API/registry enhancements (metadata forwarding, /novabot/chat), shared export adjustments, and prebuilt Noveum docs index artifacts. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant API as "FastAPI /novabot/chat"
participant Registry as "AgentRegistry"
participant Agent as "NovaBotAgent"
participant Retriever as "NumpyVectorRetriever"
participant LLM as "LLM (OpenAI/Gemini)"
participant Memory as "SessionMemory"
User->>API: POST /novabot/chat (message, user_id, metadata)
API->>Registry: get_agent_instance("business.nova_bot")
Registry-->>API: NovaBotAgent
API->>Agent: chat(message, user_id, metadata)
activate Agent
Agent->>Memory: _resolve_session_id(user_id, metadata)
Memory-->>Agent: session_id
Agent->>Memory: _get_chat_history_messages(session_id)
Memory-->>Agent: history
Agent->>Agent: _needs_rag(message)
alt needs RAG
Agent->>Retriever: aget_top_k(query)
activate Retriever
Retriever->>Retriever: embed_query & search vectors.npy
Retriever-->>Agent: RetrievedChunk[]
deactivate Retriever
Agent->>Agent: _build_chain(with retrieved context)
else no RAG
Agent->>Agent: _build_chain(system + history)
end
Agent->>LLM: invoke(prompt)
activate LLM
LLM-->>Agent: response
deactivate LLM
Agent->>Agent: format_response(with rag meta)
Agent-->>API: response_str + metadata
deactivate Agent
API-->>User: 200 OK (response)
sequenceDiagram
actor Tester
participant Simulator as "NovaBotSimulator"
participant Driver as "SessionRunner"
participant Agent as "NovaBotAgent"
participant Aggregator as "MetricsAggregator"
Tester->>Simulator: run_simulation()
Simulator->>Simulator: load_config() & load_novabot_agent()
Simulator->>Simulator: generate_question_pools()
loop per session
Simulator->>Driver: run_session(session_id, is_adversarial)
activate Driver
loop per question
Driver->>Agent: chat(question)
Agent-->>Driver: response + meta
Driver->>Driver: record interaction (latency, rag, chunks)
end
Driver-->>Simulator: SessionResult
deactivate Driver
end
Simulator->>Aggregator: aggregate_results(sessions)
Aggregator-->>Simulator: SimulationResults
Simulator->>Simulator: save_results(JSON) & print_summary()
Simulator-->>Tester: SimulationResults
Estimated code review effort🎯 4 (Complex) | ⏱️ ~55 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (13)
src/api/main.py (1)
226-232: Broaden TypeError handling could mask other issues.The try/except block catches
TypeErrorto handle agents that don't support the metadata parameter. However, this could also catch other TypeErrors (e.g., wrong argument types, other signature mismatches), potentially masking real bugs.Consider using introspection to check the signature instead:
🔎 Proposed refactor using signature inspection
+import inspect + # Invoke agent -# Backwards compatible: only pass metadata if the agent accepts it. -try: - response = await agent_instance.chat( - request.message, request.user_id, request.metadata or {} - ) -except TypeError: - response = await agent_instance.chat(request.message, request.user_id) +# Backwards compatible: only pass metadata if the agent accepts it. +sig = inspect.signature(agent_instance.chat) +if 'metadata' in sig.parameters: + response = await agent_instance.chat( + request.message, request.user_id, request.metadata or {} + ) +else: + response = await agent_instance.chat(request.message, request.user_id)This approach explicitly checks for metadata parameter support without catching other potential TypeErrors.
src/agents/basic/simple-chat-agent/main.py (1)
21-33: Shadowedsysimport inside function.Line 23 imports
syslocally, butsysis already imported at module level (line 17). The local import is technically redundant since the module-level import is already available.🔎 Suggested simplification
def _load_local_config(): import importlib.util - import sys from pathlib import Path config_path = Path(__file__).resolve().with_name("config.py")src/agents/business/nova_bot_simulator/README.md (2)
110-134: Add language specifier to fenced code block.The code block showing console output should have a language specifier (e.g.,
textorconsole) for consistent markdown linting.🔎 Suggested fix
-``` +```text 🤖 NovaBotSimulator initialized
62-75: Consider clarifying the duplicateOPENAI_API_KEYreference.The same
OPENAI_API_KEYvariable appears twice in the example (lines 64 and 69). While this is technically correct (same key used for both LLM and embeddings), consider adding a brief note that a single key serves both purposes to avoid confusion.src/agents/business/nova_bot/config.py (1)
104-106: Consider defensive parsing for numeric environment variables.
int()andfloat()will raiseValueErrorif the environment variable is set to a non-numeric value. Consider adding try/except with fallback to defaults for more graceful handling.🔎 Example defensive parsing
def _safe_int(value: str, default: int) -> int: try: return int(value) except (ValueError, TypeError): return default # Usage: top_k=_safe_int(os.getenv("NOVABOT_TOP_K"), 5),src/agents/business/helpdesk-agent/main.py (1)
30-42: Same shadowedsysimport as in simple-chat-agent.The local
sysimport on line 32 shadows the module-level import on line 24. Consider removing the redundant local import.🔎 Suggested fix
def _load_local_config(): import importlib.util - import sys from pathlib import Pathsrc/api/agent_registry.py (1)
471-493: Add explicitOptionaltype hint forconfig_overridesparameter.The static analysis correctly flags that PEP 484 prohibits implicit
Optional. The parameterconfig_overrides: Dict[str, Any] = Noneshould be explicitly typed asOptional[Dict[str, Any]].Also, line 488 has a redundant
or {}sinceconfig_overridesis already truthy-checked on line 478.🔎 Proposed fix
- async def get_agent_instance(self, agent_id: str, config_overrides: Dict[str, Any] = None): + async def get_agent_instance(self, agent_id: str, config_overrides: Optional[Dict[str, Any]] = None): """ Async wrapper used by the FastAPI app. If config_overrides are provided, a fresh instance is created (not cached) to avoid cross-request config contamination. """ if config_overrides: module = self.load_agent_module(agent_id) agent_info = self.get_agent_info(agent_id) if not module or not agent_info: return None agent_class = getattr(module, agent_info.class_name) config_class = getattr(module, agent_info.config_class) config = config_class.from_env() - for key, value in (config_overrides or {}).items(): + for key, value in config_overrides.items(): if hasattr(config, key): setattr(config, key, value) return agent_class(config) return self.create_agent_instance(agent_id, config_overrides=None)src/agents/business/nova_bot_simulator/config.py (1)
35-37: Consider extracting_repo_rootto a shared utility.This helper is duplicated in
src/agents/business/nova_bot/config.py(lines 14-16). While not critical, extracting it to a shared module would reduce duplication and ensure consistency.src/agents/business/nova_bot/main.py (2)
250-263: Silently swallowing tracing initialization errors may hide configuration issues.While the comment explains that tracing is optional, completely silencing errors makes debugging tracing issues difficult. Consider logging the exception at debug/warning level.
🔎 Proposed improvement
try: import noveum_trace from noveum_trace import NoveumTraceCallbackHandler noveum_trace.init( api_key=self.config.noveum_api_key, project=self.config.noveum_project, environment=self.config.noveum_environment, ) self._noveum_callback = NoveumTraceCallbackHandler() - except Exception: + except Exception as e: # Tracing is optional; never break the bot if tracing deps are missing. + print(f"⚠️ Noveum tracing initialization skipped: {e}") self._noveum_callback = None
449-456: Broad exception catch in RAG retrieval could mask issues.The
except Exceptionblock sets empty context and continues, which is reasonable for resilience, but logging the error would help with debugging retrieval failures.🔎 Proposed improvement
if needs_rag and self._embeddings is not None: try: tool_out = await rag_tool.ainvoke(q, config=config) context = str(tool_out.get("context", "")) sources = list(tool_out.get("sources", [])) or [] chunks_retrieved = int(tool_out.get("chunks_retrieved", 0)) - except Exception: + except Exception as e: + print(f"⚠️ RAG retrieval failed: {e}") context, sources, chunks_retrieved = "", [], 0src/agents/business/nova_bot_simulator/main.py (3)
115-147: URL parsing logic is repetitive.The topic extraction duplicates the same pattern for each URL path segment. Consider refactoring to reduce repetition.
🔎 Proposed refactor
def _extract_topics_from_metadata(metadata: Dict[str, Any]) -> List[str]: """Extract topic keywords from metadata URLs.""" topics = set() items = metadata.get("items", []) + path_segments = [ + "/concepts/", "/best-practices/", "/platform/", + "/getting-started/", "/evaluation/", "/integration-examples/" + ] for item in items: url = item.get("url", "") - # Extract meaningful parts from URLs - if "/concepts/" in url: - topic = url.split("/concepts/")[-1].split("#")[0].split("/")[0] - if topic: - topics.add(topic) - elif "/best-practices/" in url: - # ... repeated pattern + for segment in path_segments: + if segment in url: + topic = url.split(segment)[-1].split("#")[0].split("/")[0] + if topic: + topics.add(topic) + break return sorted(list(topics))
357-366: Question sampling may produce duplicates when pool is exhausted.When
messages_per_sessionexceeds the question pool size, the while loop appends random samples that could include duplicates within the same session. Consider usingrandom.choicesfor explicit repetition or shuffling the extended pool.🔎 Proposed improvement
# Generate questions for this session - session_questions = random.sample( - question_pool, min(self.config.messages_per_session, len(question_pool)) - ) - - # If not enough questions, repeat with shuffling - while len(session_questions) < self.config.messages_per_session: - session_questions.extend( - random.sample(question_pool, min(len(question_pool), self.config.messages_per_session - len(session_questions))) - ) - session_questions = session_questions[:self.config.messages_per_session] + # Allow repetition if pool is smaller than requested count + session_questions = random.choices(question_pool, k=self.config.messages_per_session)
428-434: Remove extraneous f-string prefix.Line 430 has an f-string without any placeholders.
🔎 Proposed fix
- print(f"\n🚀 Starting simulation...") + print("\n🚀 Starting simulation...")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
NoveumDocsData/index/metadata.jsonNoveumDocsData/index/vectors.npyNoveumDocsData/processed/docs.jsonsrc/agents/basic/simple-chat-agent/main.pysrc/agents/business/helpdesk-agent/config.pysrc/agents/business/helpdesk-agent/main.pysrc/agents/business/nova_bot/README.mdsrc/agents/business/nova_bot/agent_info.yamlsrc/agents/business/nova_bot/config.pysrc/agents/business/nova_bot/getstarted.mdsrc/agents/business/nova_bot/main.pysrc/agents/business/nova_bot/requirements.txtsrc/agents/business/nova_bot_simulator/README.mdsrc/agents/business/nova_bot_simulator/agent_info.yamlsrc/agents/business/nova_bot_simulator/config.pysrc/agents/business/nova_bot_simulator/main.pysrc/agents/business/nova_bot_simulator/requirements.txtsrc/api/agent_registry.pysrc/api/main.pysrc/api/requirements.txtsrc/shared/__init__.pysrc/shared/noveum_tracer.py
🧰 Additional context used
🧬 Code graph analysis (8)
src/agents/business/nova_bot/config.py (2)
src/agents/business/nova_bot_simulator/config.py (2)
_repo_root(35-37)from_env(56-75)src/agents/business/helpdesk-agent/config.py (1)
from_env(45-100)
src/api/agent_registry.py (2)
src/api/main.py (1)
get_agent_info(168-176)src/api/client.py (2)
get_agent_info(97-113)get_agent_info(349-361)
src/agents/basic/simple-chat-agent/main.py (2)
src/agents/business/helpdesk-agent/main.py (1)
_load_local_config(30-42)src/agents/business/nova_bot/main.py (1)
_load_local_config(25-38)
src/agents/business/nova_bot/main.py (1)
src/agents/business/nova_bot/config.py (2)
NovaBotConfig(21-107)from_env(78-107)
src/agents/business/helpdesk-agent/config.py (1)
src/shared/config.py (3)
LLMConfig(15-23)MemoryConfig(27-33)NoveumConfig(37-43)
src/agents/business/nova_bot_simulator/config.py (1)
src/agents/business/nova_bot/config.py (3)
NovaBotConfig(21-107)_repo_root(15-17)from_env(78-107)
src/agents/business/nova_bot_simulator/main.py (1)
src/agents/business/nova_bot_simulator/config.py (3)
SimulatorConfig(41-79)get_novabot_config(77-79)from_env(56-75)
src/shared/__init__.py (4)
src/shared/llm_client.py (1)
LLMClient(184-284)src/shared/memory.py (1)
ConversationMemory(236-282)src/shared/noveum_tracer.py (1)
NoveumTracer(16-269)src/shared/config.py (4)
LLMConfig(15-23)MemoryConfig(27-33)NoveumConfig(37-43)ChatAgentConfig(47-220)
🪛 markdownlint-cli2 (0.18.1)
src/agents/business/nova_bot_simulator/README.md
110-110: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.14.10)
src/api/agent_registry.py
471-471: PEP 484 prohibits implicit Optional
Convert to Optional[T]
(RUF013)
src/agents/basic/simple-chat-agent/main.py
29-29: Avoid specifying long messages outside the exception class
(TRY003)
src/agents/business/nova_bot/main.py
33-33: Avoid specifying long messages outside the exception class
(TRY003)
261-261: Do not catch blind exception: Exception
(BLE001)
286-288: Avoid specifying long messages outside the exception class
(TRY003)
372-372: Do not catch blind exception: Exception
(BLE001)
455-455: Do not catch blind exception: Exception
(BLE001)
530-530: Prefer TypeError exception for invalid type
(TRY004)
530-530: Avoid specifying long messages outside the exception class
(TRY003)
src/agents/business/nova_bot_simulator/config.py
24-24: Avoid specifying long messages outside the exception class
(TRY003)
src/agents/business/helpdesk-agent/main.py
38-38: Avoid specifying long messages outside the exception class
(TRY003)
src/shared/noveum_tracer.py
237-237: Do not catch blind exception: Exception
(BLE001)
240-240: Unused method argument: trace_id
(ARG002)
240-240: Unused method argument: result_data
(ARG002)
247-247: Do not catch blind exception: Exception
(BLE001)
250-250: Unused method argument: event_data
(ARG002)
src/agents/business/nova_bot_simulator/main.py
32-32: Avoid specifying long messages outside the exception class
(TRY003)
51-51: Avoid specifying long messages outside the exception class
(TRY003)
399-399: Do not catch blind exception: Exception
(BLE001)
430-430: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (28)
src/agents/business/nova_bot/README.md (1)
1-32: LGTM! Clear and comprehensive documentation.The README provides good coverage of NovaBot's purpose, setup, API routes, and configuration. The environment variables are well-documented with sensible defaults.
src/agents/business/helpdesk-agent/config.py (1)
10-10: LGTM! Import standardization improves maintainability.The change from dynamic sys.path manipulation to direct absolute import from
src.shared.configis a good practice that improves code clarity and maintainability. This aligns with the PR's objective to standardize configuration loading patterns.src/api/main.py (2)
40-40: LGTM! Metadata field addition.The optional metadata field allows passing additional context to agents while maintaining backwards compatibility through the default None value.
264-272: LGTM! Clean convenience endpoint for NovaBot.The dedicated
/novabot/chatendpoint provides a cleaner API surface for NovaBot users while properly delegating to the existingchat_with_agenthandler. This maintains consistency with the generic agent routing pattern.src/api/requirements.txt (1)
24-24: Note: noveum-trace is now a required dependency.The
noveum-trace>=0.1.0dependency is now uncommented/required, making Noveum tracing a mandatory dependency for the API. Ensure this is intentional and won't break deployments where Noveum tracing isn't needed.src/agents/business/nova_bot/agent_info.yaml (1)
1-32: LGTM! Well-structured agent metadata.The agent metadata is comprehensive and follows a clear schema. The input/output contracts are well-defined, and the configuration references align with the NovaBot implementation files mentioned in the AI summary.
src/agents/business/nova_bot_simulator/requirements.txt (1)
1-17: LGTM! Dependencies correctly mirror NovaBot requirements.Since the simulator directly imports
NovaBotAgent, it appropriately includes the same runtime dependencies as NovaBot. The comment clarifying that standard library modules are built-in is helpful.NoveumDocsData/index/metadata.json (1)
1-1017: URL patterns are intentional; no changes needed.The metadata contains both
/en/localized and non-localized URL patterns because the documentation is indexed under both locale versions (e.g.,https://noveum.ai/en/docs/getting-started/sdk-integrationandhttps://noveum.ai/docs/getting-started/sdk-integrationrepresent the same page). The retrieval system correctly extracts and returns URLs as-is from the docs index, treating both patterns as separate indexed documents. This is working as designed.src/agents/basic/simple-chat-agent/main.py (1)
108-115: LGTM!The
process()method provides a clean endpoint-compatible alias, consistent with the pattern used in other agents (e.g., helpdesk-agent).src/agents/business/nova_bot/getstarted.md (1)
1-99: Documentation structure looks good.The guide covers setup, configuration, running the server, and troubleshooting comprehensively.
src/agents/business/helpdesk-agent/main.py (1)
226-240: LGTM!The
chat()method properly extracts the response text from the structured ticket data, with safe fallback handling. Theprocess()alias maintains consistency with the API surface pattern.src/shared/__init__.py (2)
12-20: LGTM!The
__all__exports are consistent with the imports, and the addition ofChatAgentConfigprovides a clean public API for agent configuration.
7-10: The review comment cannot be verified. A search of the codebase reveals:
- No
Memoryclass is currently exported fromsrc/shared- No downstream code imports
Memoryfromsrc.shared- The current
__init__.pyexports onlyConversationMemory, not aMemoryclassThe breaking change claim cannot be substantiated based on the current state of the code.
Likely an incorrect or invalid review comment.
src/agents/business/nova_bot_simulator/agent_info.yaml (1)
1-36: LGTM!The agent metadata is well-structured with clear input/output schema definitions. The configuration aligns with the simulator implementation described in the README.
src/shared/noveum_tracer.py (1)
39-46: SDK initialization correctly follows noveum-trace v1.x pattern.The change to use
noveum_trace.init()with the specified parameters and storing the module reference is consistent with the official v1.x documentation. The defensivehasattrcheck before callingstart_trace()appropriately handles potential method availability.src/api/agent_registry.py (2)
311-336: Good improvement on sys.path cleanup.The try/finally pattern for temporarily adding and removing the agent directory from
sys.pathprevents cross-agent import collisions. This is a solid defensive coding practice.
395-423: Endpoint validation logic is well-structured.The fallback logic correctly handles cases where
processcan fall back tochat, andsupportcan fall back tohandle_support_requestorchat. This aligns with the API layer's fallback behavior mentioned in the comments.src/agents/business/nova_bot_simulator/config.py (2)
16-28: Dynamic config loading pattern is consistent.The importlib-based loading approach matches the pattern used elsewhere in the codebase and handles Python 3.12+ dataclass requirements correctly.
40-79: Config structure and from_env implementation look good.The dataclass correctly separates simulator-specific settings from the underlying NovaBot configuration. The environment variable naming convention (
SIMULATOR_*) is clear and consistent.src/agents/business/nova_bot/main.py (4)
25-42: Dynamic config loading pattern is appropriate for AgentRegistry integration.The comment on lines 23-24 explains the necessity of this approach for importlib-based loading. The pattern correctly handles Python 3.12+ dataclass requirements.
285-288: Good validation for required LLM API key.The agent correctly raises a clear error when neither OpenAI nor Gemini API keys are configured, preventing silent failures during chat operations.
302-343: Chat method implementation is well-structured.The session management with async lock, metadata handling, and response formatting are implemented correctly. The chain invocation properly passes callbacks for tracing.
521-532: Good defensive handling for missing docs file.Returning an empty list when the docs file doesn't exist allows the agent to operate (with degraded RAG) rather than failing entirely.
src/agents/business/nova_bot_simulator/main.py (5)
24-59: Dynamic loading pattern is consistent with other agent modules.The importlib-based loading for both SimulatorConfig and NovaBotAgent follows the established pattern, ensuring compatibility with the AgentRegistry's dynamic module loading.
62-104: Data models are well-structured for simulation metrics.The dataclass hierarchy (InteractionResult → SessionResult → SimulationResults) provides clear separation of concerns for per-message, per-session, and aggregate metrics.
246-298: Comprehensive adversarial test cases.The adversarial questions include system prompt extraction, security probing, confusion attacks, and edge cases (empty string, XSS, SQL injection attempts). This is valuable for testing bot robustness.
449-458: Parallel execution is correctly implemented.Using
asyncio.gatherfor parallel execution and sequential loop for non-parallel mode is the appropriate pattern. Session state isolation is handled through unique session IDs.
491-502: Results persistence handles directory creation correctly.The
mkdir(parents=True, exist_ok=True)ensures the output directory exists before writing, preventing FileNotFoundError.
| # Gemini (fallback if OpenAI not set) | ||
| gemini_api_key: Optional[str] = None | ||
| gemini_model: str = "gemini-1.5-flash" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/agents/business/nova_bot/config.py | sed -n '36,90p'Repository: Noveum/agents-library
Length of output: 2399
🏁 Script executed:
rg "gemini-1\.5-flash|gemini-2\.5-flash" --context 3Repository: Noveum/agents-library
Length of output: 2255
Inconsistent default gemini_model values between dataclass and from_env() method.
Line 38 defaults to "gemini-1.5-flash" but line 88 defaults to "gemini-2.5-flash". This inconsistency causes different behavior depending on instantiation method. Documentation (README.md and getstarted.md) confirms "gemini-1.5-flash" as the intended default, so line 88 should be updated:
- gemini_model=os.getenv("GEMINI_MODEL", "gemini-2.5-flash"),
+ gemini_model=os.getenv("GEMINI_MODEL", "gemini-1.5-flash"),Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/agents/business/nova_bot/config.py lines 36-38 and around line 88, the
dataclass default for gemini_model is "gemini-1.5-flash" but the from_env()
method sets "gemini-2.5-flash"; update the default in from_env() (around line
88) to "gemini-1.5-flash" so both instantiation paths use the documented
intended default.
… README for dataset structure and update requirements for dependencies. Implement NovaBot index building functionality with environment variable checks.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
src/agents/business/nova_bot/README.md (1)
3-3: Minor URL inconsistency with the scraper script.The README references
https://noveum.ai/docs, but the scraper script (scripts/novabot_scrape_docs.py) defaults tohttps://noveum.ai/en/docs. Consider updating the README to match the actual default start URL for consistency.scripts/novabot_build_index.py (2)
105-106: Remove redundant file creation beforenp.save.The
write_bytes(b"")call on line 105 is unnecessary.np.savewill create or overwrite the file directly. This empty write adds no value and could be misleading.🔧 Suggested fix
- (out_dir / "vectors.npy").write_bytes(b"") # ensure writable np.save(out_dir / "vectors.npy", mat)
31-32: Consider usingTypeErrorfor type validation.Per Python conventions (and Ruff TRY004),
TypeErroris more appropriate thanValueErrorwhen validating that data is a specific type (list).🔧 Suggested fix
if not isinstance(data, list): - raise ValueError("docs.json must be a list") + raise TypeError("docs.json must be a list")scripts/novabot_scrape_docs.py (2)
156-157: Usecollections.dequefor efficient queue operations.
queue.pop(0)on a list is O(n). For up to 500 pages, this becomes O(n²) overall. Usingdeque.popleft()provides O(1) performance.🔧 Suggested fix
+from collections import deque # ... - queue: List[str] = [start_url] + queue = deque([start_url]) # ... while queue and len(pages) < args.max_pages: - url = queue.pop(0) + url = queue.popleft()
162-167: Consider logging exceptions instead of silently swallowing them.Catching bare
Exceptionand continuing silently can mask important errors (DNS failures, SSL issues, timeouts). Per Ruff S112/BLE001, at least log the exception for debugging.🔧 Suggested fix
+import logging + +logging.basicConfig(level=logging.INFO) +logger = logging.getLogger(__name__) + try: resp = session.get(url, timeout=args.timeout) - except Exception: + except Exception as e: + logger.warning("Failed to fetch %s: %s", url, e) continue
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
NoveumDocsData/README.mdNoveumDocsData/index/.gitkeepscripts/novabot_build_index.pyscripts/novabot_scrape_docs.pysrc/agents/business/nova_bot/README.mdsrc/agents/business/nova_bot/requirements.txt
✅ Files skipped from review due to trivial changes (2)
- NoveumDocsData/index/.gitkeep
- NoveumDocsData/README.md
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/novabot_scrape_docs.py (1)
scripts/novabot_build_index.py (2)
repo_root(24-26)main(36-118)
🪛 Ruff (0.14.10)
scripts/novabot_build_index.py
32-32: Prefer TypeError exception for invalid type
(TRY004)
32-32: Avoid specifying long messages outside the exception class
(TRY003)
63-63: Avoid specifying long messages outside the exception class
(TRY003)
scripts/novabot_scrape_docs.py
164-165: try-except-continue detected, consider logging the exception
(S112)
164-164: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (5)
src/agents/business/nova_bot/README.md (1)
1-30: Documentation is clear and comprehensive.The README effectively documents the NovaBot's architecture, API routes, and environment variables. Good coverage of required vs. optional configuration.
scripts/novabot_build_index.py (1)
36-118: Overall implementation is well-structured.The script properly validates the API key, handles batch processing, normalizes vectors for cosine similarity, and writes both vectors and metadata. The lazy imports are acceptable for a standalone script.
scripts/novabot_scrape_docs.py (2)
99-123: Chunking logic is correctly implemented.The overlap handling properly advances through the text while ensuring the loop terminates when
endreaches the text length. The logic correctly handles edge cases where text is shorter thanmax_chars.
126-189: Main function is well-structured with sensible defaults.The crawler correctly handles URL canonicalization, respects allowed prefixes, implements polite crawling with sleep delays, and outputs structured JSON for downstream processing.
src/agents/business/nova_bot/requirements.txt (1)
1-14: All LangChain and related package versions exist on PyPI and are available for installation.
…ge project name to "NovaBot" and update Gemini model version. Enhance content hashing method in scraping script and add logging for index bounds checking in vector retrieval.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
scripts/novabot_scrape_docs.py (1)
163-168: Log exceptions for better debugging.The broad exception handling ensures the scraper continues on transient failures, but silent exceptions make debugging difficult when legitimate issues occur (e.g., network configuration, DNS failures, authentication issues).
🔎 Add logging for visibility
+ import logging + logger = logging.getLogger(__name__) + try: resp = session.get(url, timeout=args.timeout) - except Exception: + except Exception as e: + logger.debug(f"Failed to fetch {url}: {e}") continueAlternatively, if you prefer to keep imports at the top:
Add
import loggingat line 17, then:+ logger = logging.getLogger(__name__) + while queue and len(pages) < args.max_pages: url = queue.pop(0) if url in seen: continue seen.add(url) try: resp = session.get(url, timeout=args.timeout) - except Exception: + except Exception as e: + logger.debug(f"Failed to fetch {url}: {e}") continuesrc/agents/business/nova_bot/main.py (1)
531-542: ConsiderTypeErrorfor type validation.Line 540 raises
ValueErrorwhen docs.json is not a list. Since this is a type validation error rather than a value error,TypeErrorwould be more semantically appropriate.🔎 Use TypeError for type mismatch
if not isinstance(data, list): - raise ValueError("docs.json must be a list of chunk objects") + raise TypeError("docs.json must be a list of chunk objects")
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
scripts/novabot_scrape_docs.pysrc/agents/business/nova_bot/config.pysrc/agents/business/nova_bot/main.pysrc/agents/business/nova_bot/requirements.txtsrc/api/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (2)
- src/api/requirements.txt
- src/agents/business/nova_bot/requirements.txt
🧰 Additional context used
🧬 Code graph analysis (3)
scripts/novabot_scrape_docs.py (1)
scripts/novabot_build_index.py (1)
repo_root(24-26)
src/agents/business/nova_bot/main.py (1)
src/agents/business/nova_bot/config.py (2)
NovaBotConfig(21-107)from_env(78-107)
src/agents/business/nova_bot/config.py (2)
src/agents/business/nova_bot_simulator/config.py (2)
_repo_root(35-37)from_env(56-75)src/agents/business/helpdesk-agent/config.py (1)
from_env(45-100)
🪛 Ruff (0.14.10)
scripts/novabot_scrape_docs.py
115-115: Probable use of insecure hash functions in hashlib: md5
(S324)
165-166: try-except-continue detected, consider logging the exception
(S112)
165-165: Do not catch blind exception: Exception
(BLE001)
src/agents/business/nova_bot/main.py
34-34: Avoid specifying long messages outside the exception class
(TRY003)
271-271: Do not catch blind exception: Exception
(BLE001)
296-298: Avoid specifying long messages outside the exception class
(TRY003)
382-382: Do not catch blind exception: Exception
(BLE001)
465-465: Do not catch blind exception: Exception
(BLE001)
540-540: Prefer TypeError exception for invalid type
(TRY004)
540-540: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (3)
scripts/novabot_scrape_docs.py (1)
100-124: LGTM! Content hashing fixed as recommended.The implementation now uses
hashlib.md5for stable, reproducible content hashing across script runs, which addresses the previous concern about Python's non-deterministichash(). The MD5 usage here is appropriate since it's for content fingerprinting rather than cryptographic security.The chunking logic correctly handles overlap and prevents infinite loops with proper boundary checks.
src/agents/business/nova_bot/main.py (1)
160-208: LGTM! Bounds checking properly implemented.The retriever now includes defensive bounds checking (lines 189-195) with logging when the vector index and docs list are mismatched, which addresses the previously reported IndexError concern. This ensures robustness against stale or corrupted index data.
src/agents/business/nova_bot/config.py (1)
38-38: Gemini model default is valid and consistent. The dataclass default (line 38) andfrom_envdefault (line 88) both use"gemini-2.5-flash", which is a valid, officially supported Google Gemini model available on both the Gemini API and Vertex AI. No changes needed.
…ment vectors in index building, and improve system instruction generation to accept dynamic agent names. Update NoveumTracer to support concurrent tracing with context management and enhanced event logging capabilities.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/shared/noveum_tracer.py (1)
248-256: Consider removing unusedresult_dataparameter.The
result_dataparameter is no longer used in the new context-manager-based implementation. Since this is an internal method, removing it would simplify the signature, though it's a minor code cleanliness issue.🔎 Optional cleanup
- async def _end_trace(self, trace_id: str, result_data: Dict[str, Any]): + async def _end_trace(self, trace_id: str): """End a trace with results.""" try: # Pop the context manager from the dictionary to support concurrent traces active = self._active_traces.pop(trace_id, None) if active is not None: active.__exit__(None, None, None) except Exception as e: print(f"⚠️ Failed to end trace: {e}")Update call sites (lines 105, 119) accordingly:
- await self._end_trace(trace_id, { - "response": response, - "duration_seconds": duration, - "status": "success", - "response_length": len(response) if response else 0 - }) + await self._end_trace(trace_id)scripts/novabot_build_index.py (1)
29-33: Consider usingTypeErrorfor type validation.Using
TypeErrorinstead ofValueErrorwould be more semantically correct when validating that the data is a list, as this is a type constraint rather than a value constraint.🔎 Suggested change
def load_docs(docs_path: Path) -> List[Dict[str, Any]]: data = json.loads(docs_path.read_text(encoding="utf-8")) if not isinstance(data, list): - raise ValueError("docs.json must be a list") + raise TypeError("docs.json must be a list") return data
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
scripts/novabot_build_index.pysrc/agents/business/nova_bot/main.pysrc/shared/noveum_tracer.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/agents/business/nova_bot/main.py (1)
src/agents/business/nova_bot/config.py (2)
NovaBotConfig(21-107)from_env(78-107)
🪛 Ruff (0.14.10)
scripts/novabot_build_index.py
32-32: Prefer TypeError exception for invalid type
(TRY004)
32-32: Avoid specifying long messages outside the exception class
(TRY003)
63-63: Avoid specifying long messages outside the exception class
(TRY003)
src/agents/business/nova_bot/main.py
34-34: Avoid specifying long messages outside the exception class
(TRY003)
271-271: Do not catch blind exception: Exception
(BLE001)
296-298: Avoid specifying long messages outside the exception class
(TRY003)
382-382: Do not catch blind exception: Exception
(BLE001)
465-465: Do not catch blind exception: Exception
(BLE001)
540-540: Prefer TypeError exception for invalid type
(TRY004)
540-540: Avoid specifying long messages outside the exception class
(TRY003)
src/shared/noveum_tracer.py
245-245: Do not catch blind exception: Exception
(BLE001)
248-248: Unused method argument: result_data
(ARG002)
255-255: Do not catch blind exception: Exception
(BLE001)
270-270: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (11)
src/shared/noveum_tracer.py (3)
29-29: Good addition for concurrent trace support.The
_active_tracesdictionary properly enables multiple concurrent traces by keying ontrace_id, resolving the race condition identified in previous reviews.
258-280: Event logging now properly implemented.The
_log_eventmethod is now fully functional, addressing the previous review concern. The implementation correctly:
- Checks if tracing is enabled
- Calls the SDK's
log_eventmethod when available- Falls back to
trace_eventfor older SDK versions- Handles both sync and async SDK methods
- Uses defensive exception handling to prevent tracing issues from breaking the application
39-47: The noveum-trace SDK v1.x initialization pattern is correct. The code properly callsnoveum_trace.init()with the appropriate parameters (api_key, project, environment) as documented in the official SDK docs, and storing the module asself._clientto access module-level functions likestart_trace(),log_event(), andtrace_event()matches the standard API pattern. No changes needed.scripts/novabot_build_index.py (2)
24-26: LGTM!The
repo_root()helper correctly computes the repository root by navigating up one level from the scripts directory.
100-102: Empty documents case properly handled.The guard at lines 100-102 correctly addresses the previous review concern, preventing
np.vstack(vectors)from failing when no documents are processed. The early exit with status code 1 is appropriate.src/agents/business/nova_bot/main.py (6)
26-43: Well-designed dynamic config loading.The dynamic import approach correctly avoids relative import issues when the agent is loaded by
AgentRegistry. The comment aboutsys.modulesfor Python 3.12+ dataclasses is helpful context. The ImportError on line 34 appropriately fails fast if the config module is missing.
64-110: Agent name mismatch resolved.The system instruction now correctly accepts
agent_nameas a parameter and uses it throughout the prompt (lines 66, 69), addressing the previous review concern. The call site at line 408 properly passesself.agent_namefrom the configuration.
187-207: IndexError risk properly mitigated.The bounds checking at lines 188-195 correctly addresses the previous review concern. The implementation:
- Validates indices before access
- Logs warnings with helpful debugging context
- Continues processing remaining results
- Returns fewer than k results if needed (correct behavior for data integrity)
This defensive approach prevents crashes while providing visibility into index/docs mismatches.
213-246: LGTM! Well-structured initialization.The initialization sequence is logical and properly ordered:
- Configuration and session state
- Optional tracing setup
- Data loading
- LangChain components
- Chain assembly
The dynamic provider indication in the success message (line 246) is helpful for debugging.
312-353: LGTM! Robust chat implementation.The chat method is well-designed with:
- Proper input validation
- Thread-safe session management with async locks
- Insightful comment about not setting global
run_name(line 334)- History trimming to prevent unbounded growth
- Useful debugging metadata in responses
The implementation correctly handles session lifecycle and concurrent access.
395-506: Excellent chain architecture with good observability.The chain implementation demonstrates strong design:
- Tool wrapper for retrieval (lines 442-447) ensures proper trace spans
- Graceful degradation on retrieval errors (lines 465-466)
- Clean separation of concerns: prepare → generate → finalize
- Insightful comments explaining why automatic span naming is preferred (lines 492-505)
The defensive exception handling during retrieval is appropriate - the bot can still attempt to answer without context.
| if self._client and hasattr(self._client, "start_trace"): | ||
| try: | ||
| await self._client.start_trace(trace_data) | ||
| trace_id = trace_data.get("trace_id") | ||
| if not trace_id: | ||
| print("⚠️ No trace_id in trace_data, cannot start trace") | ||
| return | ||
|
|
||
| # start_trace returns a ContextualTrace context manager; keep it for later end. | ||
| ctx = self._client.start_trace( | ||
| name=trace_data.get("agent_name", "agent_interaction"), | ||
| metadata=trace_data, | ||
| ) | ||
| # Enter it to set context | ||
| ctx.__enter__() | ||
| # Store the context manager keyed by trace_id for concurrent trace support | ||
| self._active_traces[trace_id] = ctx | ||
| except Exception as e: | ||
| print(f"⚠️ Failed to start trace: {e}") |
There was a problem hiding this comment.
Potential resource leak if exception occurs between enter and dict storage.
If an exception occurs after ctx.__enter__() (line 242) but before storing in self._active_traces[trace_id] (line 244), the context will be entered but never properly exited. While the window is small, consider wrapping the storage in the same try block or using a more robust pattern.
🔎 Suggested defensive approach
async def _start_trace(self, trace_data: Dict[str, Any]):
"""Start a new trace."""
if self._client and hasattr(self._client, "start_trace"):
try:
trace_id = trace_data.get("trace_id")
if not trace_id:
print("⚠️ No trace_id in trace_data, cannot start trace")
return
# start_trace returns a ContextualTrace context manager; keep it for later end.
ctx = self._client.start_trace(
name=trace_data.get("agent_name", "agent_interaction"),
metadata=trace_data,
)
- # Enter it to set context
- ctx.__enter__()
- # Store the context manager keyed by trace_id for concurrent trace support
- self._active_traces[trace_id] = ctx
+ # Enter and store atomically to prevent leaks
+ ctx.__enter__()
+ try:
+ self._active_traces[trace_id] = ctx
+ except Exception:
+ # If storing fails, exit the context to avoid leaks
+ ctx.__exit__(None, None, None)
+ raise
except Exception as e:
print(f"⚠️ Failed to start trace: {e}")🧰 Tools
🪛 Ruff (0.14.10)
245-245: Do not catch blind exception: Exception
(BLE001)
…ariable dependencies, replacing them with hardcoded defaults. Update NovaBot to support custom Noveum endpoint configuration and enhance session trace management. Modify question generation logic in the simulator for improved topic extraction and question phrasing.
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In `@src/agents/business/nova_bot_simulator/config.py`:
- Around line 53-56: The from_env classmethod on SimulatorConfig is misleading
because its name implies loading from environment variables while the docstring
and implementation just return defaults; update either the method name or
implementation: either rename from_env to from_defaults (or similar) and adjust
the docstring accordingly, or implement actual environment loading inside
SimulatorConfig.from_env by reading expected env vars and constructing cls(...)
with those values; reference SimulatorConfig.from_env and the SimulatorConfig
class when making the change so callers remain consistent.
In `@src/agents/business/nova_bot_simulator/main.py`:
- Line 345: The print statement in main.py uses an unnecessary f-string: remove
the leading "f" from the print call (the line containing print(f"\n🚀 Starting
simulation...")) since there are no placeholders; change it to a plain string in
the same location (the print statement that starts the simulation).
- Around line 223-237: The __init__ currently hardcodes
self.messages_per_session = 3 instead of using the SimulatorConfig value; update
NovaBotSimulator.__init__ to assign self.messages_per_session =
config.messages_per_session so the SimulatorConfig.messages_per_session field is
honored (ensure SimulatorConfig is the same type expected and handle
missing/None by falling back to a default if needed).
In `@src/agents/business/nova_bot/config.py`:
- Around line 77-79: The from_env method on NovaBotConfig currently just returns
cls() but should either be renamed or actually populate fields from environment
variables; update NovaBotConfig.from_env to mirror the pattern used in
helpdesk-agent config by importing os (or using the existing env helpers) and
reading the expected env keys into the class constructor (e.g., map variables
like NOVA_BOT_API_KEY, NOVA_BOT_MODEL, etc., to the NovaBotConfig fields) and
return cls(...) with those values, or if you choose to rename, change the method
to a clearer name like default() and leave from_env unimplemented/removed to
avoid confusion. Ensure you reference the NovaBotConfig.from_env method and any
field names in the class when implementing the env reads so tests and call sites
continue to work.
- Around line 27-38: The config file currently hardcodes API keys
(noveum_api_key, openai_api_key, gemini_api_key) in NovaBotConfig; change
NovaBotConfig (and its from_env classmethod) to read these values from
environment variables (e.g., NOVEUM_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY)
with empty-string defaults or optional validation, and remove the literal
secret-like defaults from the class attributes so secrets are not stored in
source; optionally make required keys raise a clear error in from_env when
missing.
In `@src/agents/business/nova_bot/main.py`:
- Around line 307-317: The _create_embeddings method currently only checks if
self.config.openai_api_key is truthy, which misses placeholder strings; update
_create_embeddings to treat placeholder keys as invalid by verifying the key is
non-empty and looks like a real OpenAI secret (e.g., startswith "sk-") and/or
does not contain placeholder substrings like "OPENAI" or "YOUR_OPENAI"; only
then instantiate OpenAIEmbeddings with model=self.config.openai_embedding_model
and api_key=self.config.openai_api_key.
- Around line 246-247: The provider-detection uses a truthy check on
self.config.openai_api_key which misidentifies placeholder keys like "sk-***" as
valid; update the logic in the initialization (the llm_provider assignment and
the print that uses self.agent_name) to verify the API key is a real-looking key
(e.g., non-empty and not a placeholder—check it does not contain '*' and
optionally startswith("sk-")) or change the config default to an empty string so
the current truthy check works; adjust the conditional that sets llm_provider
accordingly (refer to self.config.openai_api_key and the llm_provider variable).
🧹 Nitpick comments (4)
src/agents/business/nova_bot_simulator/config.py (2)
14-30: Consider using standard imports instead of dynamicimportlibloading.The dynamic import via
importlib.utilis unusual when a standard relative import would work. This pattern adds complexity, makes IDE navigation harder, and bypasses Python's normal module resolution.A simpler approach:
from ..nova_bot.config import NovaBotConfigIf this is intentional to avoid circular imports or handle dynamic loading scenarios (e.g., the AgentRegistry), please add a comment explaining why.
33-35:_repo_root()is duplicated fromnova_bot/config.py.This helper function appears identically in both config files. Consider extracting it to a shared utility module to reduce duplication and ensure consistency.
src/agents/business/nova_bot_simulator/main.py (1)
314-328: Consider logging the exception details for debugging.The broad
except Exceptioncatch is reasonable for simulation resilience, but silently storing onlystr(e)loses the traceback. Consider logging the full exception for debugging purposes.🔍 Proposed improvement
+ import logging + logger = logging.getLogger(__name__) except Exception as e: + logger.exception(f"Error in session {session_id} message {msg_idx}") error_count += 1 interactions.append( InteractionResult(src/agents/business/nova_bot/main.py (1)
337-342: Consider logging silenced tracing exceptions.Multiple
try-except-passblocks (lines 340, 352, 380) silently swallow exceptions. While tracing is optional and shouldn't break the bot, logging these exceptions at DEBUG level would help troubleshoot tracing issues.🔍 Proposed improvement
+logger = logging.getLogger(__name__) + if session_id and reset: async with self._sessions_lock: self._sessions.pop(session_id, None) if self._noveum_callback and session_id in self._active_traces: try: self._noveum_callback.end_trace() except Exception: - pass + logger.debug("Failed to end trace on session reset", exc_info=True) self._active_traces.pop(session_id, None)Also applies to: 349-353, 376-382
| @classmethod | ||
| def from_env(cls) -> "SimulatorConfig": | ||
| """Load configuration from local defaults.""" | ||
| return cls() |
There was a problem hiding this comment.
from_env() docstring is misleading—it loads defaults, not environment variables.
Same issue as in nova_bot/config.py: the method name and docstring ("Load configuration from local defaults") don't align. The name suggests environment variable loading, but it returns the default instance.
🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot_simulator/config.py` around lines 53 - 56, The
from_env classmethod on SimulatorConfig is misleading because its name implies
loading from environment variables while the docstring and implementation just
return defaults; update either the method name or implementation: either rename
from_env to from_defaults (or similar) and adjust the docstring accordingly, or
implement actual environment loading inside SimulatorConfig.from_env by reading
expected env vars and constructing cls(...) with those values; reference
SimulatorConfig.from_env and the SimulatorConfig class when making the change so
callers remain consistent.
| def __init__(self, config: SimulatorConfig): | ||
| self.config = config | ||
| self.agent_name = "NovaBotSimulator" | ||
| self.messages_per_session = 3 | ||
|
|
||
| # Load docs metadata for question generation | ||
| self.metadata = _load_docs_metadata(config.metadata_json_path) | ||
| self.topics = _extract_topics_from_metadata(self.metadata) | ||
| self.normal_questions = _generate_normal_questions(self.topics) | ||
| self.adversarial_questions = _generate_adversarial_questions() | ||
|
|
||
| print(f"🤖 {self.agent_name} initialized") | ||
| print(f" Normal questions: {len(self.normal_questions)}") | ||
| print(f" Adversarial questions: {len(self.adversarial_questions)}") | ||
| print(f" Sessions: {config.num_sessions}, Messages per session: {self.messages_per_session}") |
There was a problem hiding this comment.
messages_per_session is hardcoded, ignoring the config value.
Line 226 sets self.messages_per_session = 3 but SimulatorConfig has a field messages_per_session (default 3). The config value is never used, making the config field misleading.
🛠️ Proposed fix
def __init__(self, config: SimulatorConfig):
self.config = config
self.agent_name = "NovaBotSimulator"
- self.messages_per_session = 3
+ self.messages_per_session = config.messages_per_session🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot_simulator/main.py` around lines 223 - 237, The
__init__ currently hardcodes self.messages_per_session = 3 instead of using the
SimulatorConfig value; update NovaBotSimulator.__init__ to assign
self.messages_per_session = config.messages_per_session so the
SimulatorConfig.messages_per_session field is honored (ensure SimulatorConfig is
the same type expected and handle missing/None by falling back to a default if
needed).
|
|
||
| async def run_simulation(self) -> SimulationResults: | ||
| """Run the complete simulation with all sessions.""" | ||
| print(f"\n🚀 Starting simulation...") |
There was a problem hiding this comment.
Remove extraneous f prefix from string without placeholders.
This f-string has no placeholders, making the f prefix unnecessary.
🛠️ Proposed fix
- print(f"\n🚀 Starting simulation...")
+ print("\n🚀 Starting simulation...")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print(f"\n🚀 Starting simulation...") | |
| print("\n🚀 Starting simulation...") |
🧰 Tools
🪛 Ruff (0.14.14)
[error] 345-345: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot_simulator/main.py` at line 345, The print
statement in main.py uses an unnecessary f-string: remove the leading "f" from
the print call (the line containing print(f"\n🚀 Starting simulation...")) since
there are no placeholders; change it to a plain string in the same location (the
print statement that starts the simulation).
| noveum_api_key: str = "*****-xC" | ||
| noveum_project: str = "NovaBot-demo" | ||
| noveum_environment: str = "dev-Novabot" | ||
| noveum_endpoint: str = "" | ||
|
|
||
| # LLM provider (answer generation) - supports OpenAI or Gemini | ||
| openai_api_key: str = "sk-***" | ||
| openai_model: str = "gpt-4o-mini" | ||
|
|
||
| # Gemini (fallback if OpenAI not set) | ||
| gemini_api_key: str = "******-" | ||
| gemini_model: str = "gemini-2.5-flash" |
There was a problem hiding this comment.
Hardcoded API keys in source code pose a security risk.
Lines 27, 33, and 37 contain hardcoded API key values. Even if these are placeholder/redacted values, committing secrets (or patterns resembling secrets) to source control is a security anti-pattern:
- They may be accidentally replaced with real keys
- They establish a pattern that encourages storing secrets in code
- Static analysis tools and secret scanners will flag these
Consider using environment variables with empty string defaults, or raise an error prompting the user to set required keys.
🔒 Proposed fix to use environment variables
+import os
+
`@dataclass`
class NovaBotConfig:
"""Runtime configuration for NovaBot."""
agent_name: str = "NovaBot"
# Noveum tracing (LangChain integration)
noveum_enabled: bool = True
- noveum_api_key: str = "*****-xC"
+ noveum_api_key: str = ""
noveum_project: str = "NovaBot-demo"
noveum_environment: str = "dev-Novabot"
noveum_endpoint: str = ""
# LLM provider (answer generation) - supports OpenAI or Gemini
- openai_api_key: str = "sk-***"
+ openai_api_key: str = ""
openai_model: str = "gpt-4o-mini"
# Gemini (fallback if OpenAI not set)
- gemini_api_key: str = "******-"
+ gemini_api_key: str = ""
gemini_model: str = "gemini-2.5-flash"Then in from_env():
`@classmethod`
def from_env(cls) -> "NovaBotConfig":
return cls(
noveum_api_key=os.getenv("NOVEUM_API_KEY", ""),
openai_api_key=os.getenv("OPENAI_API_KEY", ""),
gemini_api_key=os.getenv("GEMINI_API_KEY", ""),
# ... other env-driven fields
)🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot/config.py` around lines 27 - 38, The config file
currently hardcodes API keys (noveum_api_key, openai_api_key, gemini_api_key) in
NovaBotConfig; change NovaBotConfig (and its from_env classmethod) to read these
values from environment variables (e.g., NOVEUM_API_KEY, OPENAI_API_KEY,
GEMINI_API_KEY) with empty-string defaults or optional validation, and remove
the literal secret-like defaults from the class attributes so secrets are not
stored in source; optionally make required keys raise a clear error in from_env
when missing.
| @classmethod | ||
| def from_env(cls) -> "NovaBotConfig": | ||
| return cls() |
There was a problem hiding this comment.
from_env() method is misleading—it doesn't read environment variables.
The method name and the pattern from other config files (e.g., src/agents/business/helpdesk-agent/config.py) suggests this should load configuration from environment variables. Currently it just returns the default instance, making from_env() functionally identical to cls().
Either rename the method to clarify its behavior, or implement actual environment variable loading to match the established pattern in this codebase.
🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot/config.py` around lines 77 - 79, The from_env
method on NovaBotConfig currently just returns cls() but should either be
renamed or actually populate fields from environment variables; update
NovaBotConfig.from_env to mirror the pattern used in helpdesk-agent config by
importing os (or using the existing env helpers) and reading the expected env
keys into the class constructor (e.g., map variables like NOVA_BOT_API_KEY,
NOVA_BOT_MODEL, etc., to the NovaBotConfig fields) and return cls(...) with
those values, or if you choose to rename, change the method to a clearer name
like default() and leave from_env unimplemented/removed to avoid confusion.
Ensure you reference the NovaBotConfig.from_env method and any field names in
the class when implementing the env reads so tests and call sites continue to
work.
| llm_provider = "OpenAI" if self.config.openai_api_key else "Gemini" | ||
| print(f"🛰️ {self.agent_name} initialized (LangChain-only {llm_provider} + OpenAIEmbeddings RAG)") |
There was a problem hiding this comment.
Provider detection will incorrectly select OpenAI when using placeholder keys.
Line 246 checks if self.config.openai_api_key to determine the LLM provider. However, the config defaults include placeholder values like "sk-***" which are truthy, causing OpenAI to always be selected even when the user intends to use Gemini.
This ties into the earlier security concern—if you use empty strings as defaults for API keys, this logic would work correctly.
🛠️ Proposed fix
- llm_provider = "OpenAI" if self.config.openai_api_key else "Gemini"
+ # Check for actual API keys, not placeholder values
+ has_openai = self.config.openai_api_key and not self.config.openai_api_key.startswith("sk-***")
+ llm_provider = "OpenAI" if has_openai else "Gemini"Or better, use empty string defaults in config and this check works as-is.
🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot/main.py` around lines 246 - 247, The
provider-detection uses a truthy check on self.config.openai_api_key which
misidentifies placeholder keys like "sk-***" as valid; update the logic in the
initialization (the llm_provider assignment and the print that uses
self.agent_name) to verify the API key is a real-looking key (e.g., non-empty
and not a placeholder—check it does not contain '*' and optionally
startswith("sk-")) or change the config default to an empty string so the
current truthy check works; adjust the conditional that sets llm_provider
accordingly (refer to self.config.openai_api_key and the llm_provider variable).
| def _create_llm(self): | ||
| # Prefer OpenAI if available, otherwise fall back to Gemini | ||
| if self.config.openai_api_key: | ||
| from langchain_openai import ChatOpenAI | ||
|
|
||
| return ChatOpenAI( | ||
| model=self.config.openai_model, | ||
| api_key=self.config.openai_api_key, | ||
| temperature=self.config.temperature, | ||
| max_tokens=self.config.max_output_tokens, | ||
| ) | ||
| elif self.config.gemini_api_key: | ||
| from langchain_google_genai import ChatGoogleGenerativeAI | ||
|
|
||
| return ChatGoogleGenerativeAI( | ||
| model=self.config.gemini_model, | ||
| google_api_key=self.config.gemini_api_key, | ||
| temperature=self.config.temperature, | ||
| max_output_tokens=self.config.max_output_tokens, | ||
| ) | ||
| else: | ||
| raise ValueError( | ||
| "Missing LLM API key. Set either OPENAI_API_KEY or GEMINI_API_KEY." | ||
| ) |
There was a problem hiding this comment.
Same provider detection issue in _create_llm().
The truthy check on self.config.openai_api_key (line 284) will always pass with placeholder values, causing Gemini to never be used as fallback. The error on line 303-305 will also never trigger since the placeholder key is truthy.
🧰 Tools
🪛 Ruff (0.14.14)
[warning] 303-305: Avoid specifying long messages outside the exception class
(TRY003)
| def _create_embeddings(self): | ||
| if not self.config.openai_api_key: | ||
| # Allow running without retrieval (RAG will return empty context) | ||
| return None | ||
|
|
||
| from langchain_openai import OpenAIEmbeddings | ||
|
|
||
| return OpenAIEmbeddings( | ||
| model=self.config.openai_embedding_model, | ||
| api_key=self.config.openai_api_key, | ||
| ) |
There was a problem hiding this comment.
Embeddings creation also affected by placeholder key issue.
Line 308's check if not self.config.openai_api_key will never be true with placeholder values, so embeddings will always be created even when the key is invalid.
🤖 Prompt for AI Agents
In `@src/agents/business/nova_bot/main.py` around lines 307 - 317, The
_create_embeddings method currently only checks if self.config.openai_api_key is
truthy, which misses placeholder strings; update _create_embeddings to treat
placeholder keys as invalid by verifying the key is non-empty and looks like a
real OpenAI secret (e.g., startswith "sk-") and/or does not contain placeholder
substrings like "OPENAI" or "YOUR_OPENAI"; only then instantiate
OpenAIEmbeddings with model=self.config.openai_embedding_model and
api_key=self.config.openai_api_key.
NovaBot and NovaBot Simulator agents - chatbot based on api.
Summary by CodeRabbit
New Features
Documentation
API Improvements
Chores