
Conversation entities: first-class LLM conversation storage#602

Open
crspeller wants to merge 45 commits into master from conversation-entities

Conversation

@crspeller
Member

@crspeller crspeller commented Apr 8, 2026

Summary

Conversations become first-class entities stored in plugin tables (LLM_Conversations, LLM_Turns). Posts remain display artifacts, but the conversation model is the source of truth. A unified ToolRunner replaces the divergent auto-run code paths, and thread analysis, channel analysis, and search are all modeled as conversations.

Key changes:

  • Schema — New LLM_Conversations and LLM_Turns tables with JSONB content blocks. LLM_PostMeta dropped.
  • Content blocks — Typed JSONB blocks (text, thinking, tool_use, tool_result, file, image, annotations) with block-level shared flag for channel privacy.
  • Unified ToolRunner — Single call-execute-recall loop replacing WithAutoRunTools(), wrapStreamWithMCPAutoApproval(), and all provider-level auto-run handling.
  • Conversation service — Creates/continues conversations, builds CompletionRequest from turns (not posts), writes tool turns, generates titles, merges channel mention context.
  • API privacy filtering — GET /conversations/{id} returns full content to owner, redacts unshared tool data for other channel members.
  • Streaming turn persistence — Placeholder turn at stream start, finalized with content blocks + token counts at stream end.
  • DM conversations — DM path uses conversation service + ToolRunner instead of PostToAIPost/ThreadToLLMPosts.
  • Channel mentions — Multi-bot thread isolation via (RootPostID, BotID) lookup. Tool blocks default to shared=false in channels.
  • Thread analysis, channel analysis, search — All create conversation entities with appropriate Operation types.
  • Webapp — LLMBotPost rewritten to fetch from conversation API via useConversation hook. RHS threads backed by LLM_Conversations.
  • Cleanup — Removed ~5,700 lines of old code: auto-run wrappers, KV store tool data, post-based conversation reconstruction, legacy post props.

What stays independent (not converted): Meeting summarization, emoji reactions, title generation, bridge API.
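
The call-execute-recall loop that the unified ToolRunner introduces can be sketched as follows. All types and names below are illustrative stand-ins, not the PR's actual API.

```go
package main

import "fmt"

// ToolCall and Response are simplified stand-ins for the model's output.
type ToolCall struct {
	Name   string
	Result string
}

type Response struct {
	Text      string
	ToolCalls []ToolCall
}

// callModel stands in for an LLM completion; it requests a tool once,
// then answers with plain text.
func callModel(history []string) Response {
	if len(history) == 0 {
		return Response{ToolCalls: []ToolCall{{Name: "search"}}}
	}
	return Response{Text: "final answer"}
}

// run loops: call the model, execute any tool calls, feed results back,
// until the model returns plain text or maxRounds is hit.
func run(maxRounds int) (string, int) {
	var history []string
	for round := 0; round < maxRounds; round++ {
		resp := callModel(history)
		if len(resp.ToolCalls) == 0 {
			return resp.Text, round + 1
		}
		for _, tc := range resp.ToolCalls {
			// Execute the tool and record the result as a new turn.
			history = append(history, "result of "+tc.Name)
		}
	}
	return "", maxRounds
}

func main() {
	text, rounds := run(5)
	fmt.Println(text, rounds)
}
```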

Ticket Link

N/A

Screenshots

N/A — no visual changes to the UI. Data source changed from post props to conversation entity API.

Release Note

Conversations are now stored as first-class entities in new database tables (`LLM_Conversations`, `LLM_Turns`). Added new `GET /conversations/{id}` API endpoint with privacy filtering. Added new database migrations (000004-000006). Added `conversation_id` post prop. Removed `LLM_PostMeta` table. Pre-v2 bot posts render message text only (no tool calls, reasoning, or annotations).

Summary by CodeRabbit

Release Notes

  • New Features

    • Added persistent conversation storage with turn history tracking for direct messages and channel mentions.
    • Implemented conversation-based tool approval/rejection workflow allowing users to accept or decline tool calls.
    • Integrated search results directly into conversation history with persistent context.
  • Enhancements

    • Improved tool call tracking with clearer separation of execution results and conversation flow.
    • Enhanced channel analysis by creating conversation records for auditability.
    • Better support for continuing conversations while maintaining context across interactions.

@github-actions

github-actions bot commented Apr 8, 2026

🤖 LLM Evaluation Results

OpenAI

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ OPENAI 28 21 7 75.0%

❌ Failed Evaluations


OPENAI

1. TestReactEval/[openai]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text token "heart_eyes_cat" rather than an actual cat emoji (e.g., 😺) or a heart/love emoji (e.g., ❤️). Therefore it does not contain a cat emoji or heart/love emoji.

2. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not actually provide a list of bugs; it states it cannot pull bug reports and provides an empty, ready-to-fill template instead of populated bug entries.

3. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output does not include descriptions of each bug; it states it cannot pull bug reports and provides an empty template without any actual bug entries or their descriptions.

4. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output provides a template with a “Reported by” column but does not actually attribute any specific bug to a user because no bugs are listed or mapped to reporters.

5. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output does not mention the specific bug about trying to save without a color or the save button doing nothing, nor does it attribute any bug to @maria.nunez. It only provides a generic template and asks the user to paste bug entries.

6. TestConversationMentionHandling/[openai]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output does not mention any specific bug about an end user being able to change the channel banner, nor does it attribute such a bug to @maria.nunez. It only provides a generic template and asks the user to paste bug entries.

7. TestDirectMessageConversations/[openai]_bot_dm_tool_introspection

  • Score: 0.50
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to documentation (docs.mattermost.com) but does not mention GitHub anywhere. Since the rubric requires both mentioning Github and referring to the documentation, it fails.

Anthropic

⚠️ Overall: 21/28 tests passed (75.0%)

Provider Total Passed Failed Pass Rate
⚠️ ANTHROPIC 28 21 7 75.0%

❌ Failed Evaluations


ANTHROPIC

1. TestReactEval/[anthropic]_react_cat_message

  • Score: 0.00
  • Rubric: The word/emoji is a cat emoji or a heart/love emoji
  • Reason: The output is the text string "heart_eyes_cat" rather than an actual cat emoji (e.g., 😺/🐱) or a heart/love emoji (e.g., ❤️/😍).

2. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: is a list of bugs
  • Reason: The output does not provide a list of bugs; it states lack of access to trackers and suggests where to look, without enumerating any bug items.

3. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: includes a description of each bug
  • Reason: The output only states it cannot access bug tracking systems and suggests where to look; it does not provide any bug descriptions at all.

4. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes each bug to a user
  • Reason: The output does not list any bugs and therefore does not attribute each bug to a user. It only explains limitations and suggests ways to gather bug reports.

5. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: attributes the bug about trying to save without a color and the save button not doing anything to @maria.nunez
  • Reason: The output explains lack of access to bug trackers and suggests ways to retrieve bug reports, but it does not mention @maria.nunez or attribute any bug (saving without a color / save button doing nothing) to her.

6. TestConversationMentionHandling/[anthropic]_conversation_from_attribution_long_thread.json

  • Score: 0.00
  • Rubric: the bug about the end user being able to change channel banner is attributed to @maria.nunez
  • Reason: The output only states it cannot access bug trackers and suggests ways to find bugs; it does not mention the specific bug about changing the channel banner or attribute it to @maria.nunez.

7. TestDirectMessageConversations/[anthropic]_bot_dm_tool_introspection

  • Score: 0.00
  • Rubric: mentions Github and refers to the documentation
  • Reason: The output refers to the documentation by linking to docs.mattermost.com, but it does not mention GitHub anywhere. Since the rubric requires both mentioning GitHub and referring to the documentation, it fails.

This comment was automatically generated by the eval CI pipeline.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c10461e90c


Comment thread conversation/service.go
Comment on lines +218 to +220
if err != nil {
return nil, err
}


P1 Badge Recover from unique-conflict in get-or-create conversation flow

When two requests hit the same (RootPostID, BotID) concurrently, both can pass the initial lookup and race on insert; the second insert will hit the unique index (idx_llm_conversations_thread_bot) and this branch returns the error directly instead of reloading the existing conversation and appending the user turn. In that case the user request fails even though a valid conversation now exists, so channel/DM replies can be dropped under normal concurrent usage.
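
The suggested recovery can be sketched as retrying the lookup when the insert loses the race. The store below simulates the unique index on (RootPostID, BotID) with a mutex-guarded map; all names are illustrative, not the actual store code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errUniqueViolation = errors.New("unique constraint violation")

// store simulates a table with a unique index on the key.
type store struct {
	mu    sync.Mutex
	byKey map[string]string // (rootPostID,botID) -> conversationID
}

func (s *store) get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	id, ok := s.byKey[key]
	return id, ok
}

func (s *store) insert(key, id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.byKey[key]; exists {
		return errUniqueViolation // maps to idx_llm_conversations_thread_bot firing
	}
	s.byKey[key] = id
	return nil
}

// getOrCreate re-reads the row when the insert hits the unique index,
// instead of failing the request.
func getOrCreate(s *store, key, newID string) (string, error) {
	if id, ok := s.get(key); ok {
		return id, nil
	}
	err := s.insert(key, newID)
	if err == nil {
		return newID, nil
	}
	if errors.Is(err, errUniqueViolation) {
		if id, ok := s.get(key); ok {
			return id, nil // another request created it first; reuse it
		}
	}
	return "", err
}

func main() {
	s := &store{byKey: map[string]string{"root1:bot1": "conv-A"}}
	id, err := getOrCreate(s, "root1:bot1", "conv-B")
	fmt.Println(id, err)
}
```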


Comment thread conversations/tool_approval.go Outdated
Comment on lines +219 to +220
// Continue the LLM loop with the conversation context.
return c.streamToolFollowUp(bot, user, channel, post, conv, isDM)


P1 Badge Apply tool-result approvals before continuing follow-up

This handler accepts acceptedToolIDs for the result-sharing step but never uses them, and immediately continues with streamToolFollowUp. As a result, user decisions in the second approval stage have no effect: tool blocks are never marked shared for approved results, so non-requesters keep seeing redacted tool arguments/results even after approval.
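
A minimal sketch of the missing step: marking approved tool blocks as shared before the follow-up stream continues. Types and names here are assumptions, not the handler's actual code.

```go
package main

import "fmt"

// block is an illustrative content-block shape.
type block struct {
	Type      string
	ToolUseID string
	Shared    bool
}

// applyResultApprovals flips Shared to true on tool blocks whose IDs the
// user accepted, so non-requesters stop seeing redacted data.
func applyResultApprovals(blocks []block, acceptedToolIDs []string) []block {
	accepted := make(map[string]bool, len(acceptedToolIDs))
	for _, id := range acceptedToolIDs {
		accepted[id] = true
	}
	for i := range blocks {
		isTool := blocks[i].Type == "tool_use" || blocks[i].Type == "tool_result"
		if isTool && accepted[blocks[i].ToolUseID] {
			blocks[i].Shared = true
		}
	}
	return blocks
}

func main() {
	blocks := []block{
		{Type: "tool_use", ToolUseID: "t1"},
		{Type: "tool_result", ToolUseID: "t1"},
		{Type: "tool_use", ToolUseID: "t2"},
	}
	blocks = applyResultApprovals(blocks, []string{"t1"})
	fmt.Println(blocks[0].Shared, blocks[1].Shared, blocks[2].Shared)
}
```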


@coderabbitai

coderabbitai bot commented Apr 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

A major architectural refactor introduces conversation-store-backed persistence for LLM interactions, replacing streaming-layer post-property tracking with database-persisted conversation turns. A new conversation.Service and toolrunner package handle sequential tool execution, while the API, channels, search, and thread analysis components are updated to coordinate through the conversation store instead of direct LLM calls.

Changes

Cohort / File(s) Summary
Conversation Service Architecture
conversation/service.go, conversation/service_test.go, conversation/content_block.go, conversation/content_block_test.go, conversation/convert.go, conversation/convert_test.go, conversation/helpers.go, conversation/helpers_test.go
New conversation service implementing turn/content persistence, LLM request building, tool result recording, and title generation via Store and BotLookup interfaces. Includes content block marshalling/unmarshalling, conversions between llm.Post and ContentBlock structures, and comprehensive test coverage with real PostgreSQL backend.
Database Schema & Migrations
store/conversations.go, store/conversations_test.go, store/turns.go, store/turns_test.go, store/migrations/000005*, store/migrations/000006*, store/migrations/000007*, store/migrations/000008*, store/migrations/000009*, store/migrations/000010*
New LLM_Conversations and LLM_Turns tables with SQL migrations for creation, indexing, deduplication, and legacy LLM_PostMeta cleanup. Includes auto-sequencing retry logic for concurrent turn creation and soft-delete support.
Tool Execution Runner
toolrunner/toolrunner.go, toolrunner/toolrunner_test.go
New toolrunner package implementing iterative tool-call execution loops, streaming result forwarding, max-rounds enforcement, consecutive failure tracking, and tool disabling. Replaces llm.AutoRunToolsWrapper logic.
API Layer Updates
api/api.go, api/api_conversation.go, api/api_conversation_test.go, api/api_channel.go, api/api_post.go, api/api_test.go, api/api_no_tools_test.go, api/api_admin_test.go, api/api_search_test.go
Adds GET /conversations/:conversationid handler with authorization checks and turn filtering for non-owners. Removes private tool endpoints. Updates channel analysis and DM handling to use conversation service and stream results from AnalysisResult. Integrates ConversationStore dependency and conversation ownership validation.
Channel & Thread Analysis
channels/channels.go, channels/analysis_conversation_test.go, channels/channels_eval_test.go, threads/threads.go, threads/analysis_conversation_test.go, threads/threads_test.go
Refactors channel interval analysis and thread analysis to create/persist conversations, build completion requests from stored turns, and return AnalysisResult with conversation ID alongside stream. Removes direct llm.CompletionRequest construction in favour of conversation-backed flows.
Conversation Message Handling
conversations/handle_messages.go, conversations/conversations.go, conversations/regeneration.go, conversations/tool_approval.go, conversations/store.go, conversations/web_search_context.go, conversations/test_helpers_test.go, conversations/dm_conversation_test.go, conversations/channel_mention_test.go
Splits DM/channel-mention handling into conversation-creation and request-processing phases. Adds tool approval/rejection flows via HandleToolCall and HandleToolResult that update turn content and trigger follow-up streaming. Refactors title saving from async post-meta upserts to conversation-entity updates. Removes old tool-handling KV/redaction logic and respondToPost helper.
Search Integration
search/search.go, search/search_test.go, search/search_conversation_test.go, search/search_citation_eval_test.go
Adds conversationService dependency to Search. Routes embedding-based and non-streaming search queries through conversation creation and turn persistence when service is available, falling back to direct LLM calls when nil. Removes legacy SearchQueryProp post property usage.
Streaming & Turn Persistence
streaming/streaming.go, streaming/post_modifier.go, streaming/turn_persistence_test.go, streaming/test_helpers_test.go
Refactors StreamToPost to accumulate assistant text/reasoning/tool-calls into turn content blocks and persist via TurnStore callback (when configured). Removes MCP auto-approval, tool-call redaction, KV storage, and channel/DM tool-call branching. Adds ConversationIDProp constant for post-conversation linkage.
Removed LLM Auto-Run Wrappers
llm/auto_run_tools.go, llm/auto_run_tools_test.go, llm/tools.go, llm/language_model.go, bots/bots.go
Deletes AutoRunToolsWrapper, WithAutoRunTools option, and helper utilities (HasPreExecutedToolCalls, ExecuteAutoRunTools, EnrichToolCallsWithServerOrigin). Removes AutoRunTools config field. Updates bot LLM initialization to skip auto-run wrapper wrapping.
MCP Tool Policy
mcp/tool_policy.go, conversations/bot_channel_tool_filter.go
Introduces mcp.ToolPolicyChecker interface and ToolPolicyFunc type for tool policy queries. Updates tool-filter checker parameter type from streaming.ToolPolicyChecker to mcp.ToolPolicyChecker and adds applyToolAvailability helper to manage DisabledToolsInfo.
Test Cleanup & Helpers
conversations/conversations_test.go, conversations/direct_message_eval_test.go, conversations/handle_messages_placeholder_test.go, conversations/mcp_auto_approval.go, conversations/mcp_auto_approval_test.go, conversations/tool_handling.go, conversations/tool_handling_test.go, conversations/tool_permissions.go, e2e/tests/semantic-search/search-sources.spec.ts, llm/tools_test.go, streaming/auto_approval_test.go, streaming/tool_call_redaction_test.go, postgres/pgvector_test.go, meetings/transcriptions.go, mcpserver/eval_helpers_test.go, mcpserver/tools_eval_test.go, mcp/testhelpers_test.go
Removes legacy auto-approval/tool-handling test files and placeholder-streaming tests. Updates test invocations to use toolrunner instead of AutoRunToolsWrapper. Refactors DM/message eval tests to call LLM directly with conversation-based post sequences. Adds turn_persistence_test.go and in-memory fake implementations for conversation store. Updates pgvector test to skip instead of fail when extension is unavailable. Improves Testcontainers retry logic in mcp/testhelpers_test.go.
Server Wiring
server/main.go, webapp/jest.config.js
Adds conversation.Service creation and propagation to conversationsService, searchService, and apiService. Updates MCP tool policy wiring to use mcp.ToolPolicyFunc. Replaces DeletePostMetaForDeletedPost call with DeleteConversationsForDeletedPost. Adds Jest configuration for TypeScript transformation and module path aliases.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant API as API Layer
    participant ConvService as Conversation Service
    participant Store as PostgreSQL Store
    participant LLM as Language Model
    participant ToolRunner as Tool Runner

    User->>API: POST /conversations/<id>/turns/<turnid>/approve
    API->>ConvService: GetConversation(convID)
    ConvService->>Store: GetConversation(convID)
    Store-->>ConvService: Conversation
    ConvService-->>API: Conversation

    API->>ConvService: GetTurns(convID)
    ConvService->>Store: GetTurnsForConversation(convID)
    Store-->>ConvService: []Turn
    ConvService-->>API: []Turn

    API->>ConvService: BuildCompletionRequest(conv, context)
    ConvService->>ConvService: unmarshalBlocks() + convert to llm.Post
    ConvService-->>API: CompletionRequest

    API->>ToolRunner: Run(CompletionRequest, shouldExecute)
    ToolRunner->>LLM: ChatCompletion(request)
    LLM-->>ToolRunner: TextStreamResult (tool calls)

    ToolRunner->>ToolRunner: ResolveTool() for each call
    ToolRunner->>ConvService: onToolTurns callback
    ConvService->>Store: CreateTurnAutoSequence (tool_result)
    Store-->>ConvService: Turn with Sequence

    ToolRunner-->>API: ToolRunResult
    API->>ConvService: WriteToolTurns(convID, turns, shared=true)
    ConvService->>Store: UpdateTurnContent()
    Store-->>ConvService: OK

    API-->>User: Response with streamed continuation

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • PR #520: Modifies MCP tool policy interfaces, server-origin handling, and integrates per-tool policy changes across streaming, conversations, llm, and mcp layers with overlapping auto-run/policy semantics.
  • PR #591: Adjusts conversations and streaming tool auto-run/auto-share behaviours via expanded MCP tooling, touching the same policy-related code paths.
  • PR #503: Modifies api/api.go constructor to accept additional dependencies (store/config), similar wiring pattern to the conversation service integration in this PR.

Suggested labels

Setup Cloud Test Server



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 14

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
search/search.go (1)

480-486: ⚠️ Potential issue | 🟡 Minor

Inconsistent tool disabling between conversation and fallback paths.

When conversationService is available, ChatCompletionNoStream is called with llm.WithToolsDisabled() (line 460), but the fallback path at line 483 doesn't disable tools. This could lead to different behavior depending on whether the conversation service is configured.

Proposed fix for consistency
 	// Fallback: direct LLM call without conversation tracking
-	answer, err := bot.LLM().ChatCompletionNoStream(prompt)
+	answer, err := bot.LLM().ChatCompletionNoStream(prompt, llm.WithToolsDisabled())
 	if err != nil {
 		return Response{}, fmt.Errorf("failed to generate answer: %w", err)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@search/search.go` around lines 480 - 486, The fallback path calls
bot.LLM().ChatCompletionNoStream without disabling tools, causing inconsistent
behavior vs the conversationService path which uses llm.WithToolsDisabled();
update the fallback call to include the same option (pass
llm.WithToolsDisabled() to ChatCompletionNoStream) so both paths use identical
tool-disabled semantics, ensuring consistency with the conversation branch that
invokes ChatCompletionNoStream with llm.WithToolsDisabled().
🟡 Minor comments (11)
e2e/tests/semantic-search/search-sources.spec.ts-108-108 (1)

108-108: ⚠️ Potential issue | 🟡 Minor

Scope the response-text assertion to RHS to avoid false positives.

Line 108 uses a page-wide text lookup; if the same text appears elsewhere, this can pass incorrectly.

🔧 Proposed fix
-        await expect(page.getByText(searchResponseWithSourcesText)).toBeVisible();
+        const rhs = page.getByTestId('mattermost-ai-rhs');
+        await expect(rhs.getByText(searchResponseWithSourcesText)).toBeVisible();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@e2e/tests/semantic-search/search-sources.spec.ts` at line 108, The assertion
uses a global page.getByText(searchResponseWithSourcesText) which can match the
same text anywhere; scope it to the RHS container by locating the RHS (e.g., the
right-hand pane locator used elsewhere in the test suite or a stable
selector/component for the RHS) and call
getByText(searchResponseWithSourcesText) on that RHS locator instead of page so
the assertion only checks visibility inside the right-hand results area.
mcp/testhelpers_test.go-124-132 (1)

124-132: ⚠️ Potential issue | 🟡 Minor

Use a fresh cleanup context for termination on partial startup failures.

On Line 124 and Line 131, Terminate reuses ctx, which may already be expired. That can skip cleanup and leak containers across retries.

🔧 Suggested fix
  serverURL, err := container.URL(ctx)
  if err != nil {
-		_ = container.Terminate(ctx)
+		cleanupCtx, cleanupCancel := context.WithTimeout(context.Background(), 30*time.Second)
+		_ = container.Terminate(cleanupCtx)
+		cleanupCancel()
 		return nil, fmt.Errorf("failed to get server URL: %w", err)
  }

  // Get admin client
  adminClient, err := container.GetAdminClient(ctx)
  if err != nil {
-		_ = container.Terminate(ctx)
+		cleanupCtx, cleanupCancel := context.WithTimeout(context.Background(), 30*time.Second)
+		_ = container.Terminate(cleanupCtx)
+		cleanupCancel()
 		return nil, fmt.Errorf("failed to get admin client: %w", err)
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp/testhelpers_test.go` around lines 124 - 132, When terminating the test
container on partial startup failures (calls to container.Terminate(ctx) in the
error branches after GetServerURL and GetAdminClient), don't reuse the
possibly-expired request context; create a fresh cleanup context (e.g., via
context.WithTimeout or context.Background with a short timeout) and use that for
container.Terminate. Update the error branches around GetServerURL and
GetAdminClient to build a new cleanupCtx and call
container.Terminate(cleanupCtx) before returning the wrapped error.
api/api_test.go-170-183 (1)

170-183: ⚠️ Potential issue | 🟡 Minor

Keep the two turn indexes in sync.

UpdateTurnContent mutates m.turns, but GetTurnByPostID reads from m.turnsByPost. After an update, the same turn can return different content depending on which lookup path the test uses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_test.go` around lines 170 - 183, UpdateTurnContent currently updates
the turn content in m.turns but does not keep the parallel index m.turnsByPost
in sync, causing GetTurnByPostID to return stale content; modify
mockConversationStore.UpdateTurnContent to also locate the corresponding entry
in m.turnsByPost (using the same turn ID or post ID mapping used by
GetTurnByPostID) and update its Content field when you update m.turns so both
maps (m.turns and m.turnsByPost) remain consistent.
channels/channels_eval_test.go-145-147 (1)

145-147: ⚠️ Potential issue | 🟡 Minor

Fail fast on missing records in the eval store.

Line 145 returns (nil, nil) for a miss, and the update helpers on Lines 149-193 also succeed when the target conversation/turn does not exist. That makes the eval path more permissive than the real store and can let persistence regressions pass silently.

Also applies to: 149-160, 180-193

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@channels/channels_eval_test.go` around lines 145 - 147, The evalInMemoryStore
currently returns (nil, nil) from GetConversationByThreadAndBot which hides
missing-records; change evalInMemoryStore.GetConversationByThreadAndBot to
return a clear error when a conversation is not found (e.g., a not-found error
value or fmt.Errorf) instead of nil, nil, and update the evalInMemoryStore
update helper methods (the update helpers in the same type that modify
conversations/turns referenced around lines 149-160 and 180-193) to check for
the target conversation/turn existence first and return the same not-found error
if missing rather than succeeding silently.
api/api_conversation_test.go-23-25 (1)

23-25: ⚠️ Potential issue | 🟡 Minor

Restore the Gin globals after this test.

gin.SetMode and gin.DefaultWriter are package-level state. Leaving them mutated here can bleed into later API tests and make failures order-dependent.

🧪 Suggested cleanup
 func TestHandleGetConversation(t *testing.T) {
-	gin.SetMode(gin.ReleaseMode)
-	gin.DefaultWriter = io.Discard
+	prevMode := gin.Mode()
+	prevWriter := gin.DefaultWriter
+	t.Cleanup(func() {
+		gin.SetMode(prevMode)
+		gin.DefaultWriter = prevWriter
+	})
+
+	gin.SetMode(gin.ReleaseMode)
+	gin.DefaultWriter = io.Discard
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_conversation_test.go` around lines 23 - 25, The test
TestHandleGetConversation modifies Gin globals (gin.SetMode and
gin.DefaultWriter) and must restore them to avoid leaking state; capture the
previous values (e.g., prevMode := gin.Mode(); prevWriter := gin.DefaultWriter)
before calling gin.SetMode and reassigning gin.DefaultWriter, then use defer to
restore gin.SetMode(prevMode) and gin.DefaultWriter = prevWriter at the start of
TestHandleGetConversation so the global state is returned after the test
completes.
api/api_channel.go-173-176 (1)

173-176: ⚠️ Potential issue | 🟡 Minor

Don't swallow conversation title update failures.

Both handlers discard UpdateConversationTitle errors, so a store failure leaves the conversation untitled with no signal in logs while the request still returns success. Please at least guard on a non-empty ConversationID and log the failure. As per coding guidelines Check all errors explicitly in production code.

🛠️ Suggested pattern for both handlers
- if a.convService != nil {
- 	_ = a.convService.UpdateConversationTitle(result.ConversationID, TitleSummarizeChannel)
- }
+ if a.convService != nil && result.ConversationID != "" {
+ 	if err := a.convService.UpdateConversationTitle(result.ConversationID, TitleSummarizeChannel); err != nil {
+ 		a.pluginAPI.Log.Warn("failed to update conversation title",
+ 			"conversationID", result.ConversationID,
+ 			"error", err,
+ 		)
+ 	}
+ }

Apply the same pattern in the interval handler, using promptTitle.

Also applies to: 272-275

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_channel.go` around lines 173 - 176, The conversation title update
currently ignores errors; modify the calls to
a.convService.UpdateConversationTitle (used with result.ConversationID and
TitleSummarizeChannel, and the interval handler using promptTitle) to first
check that ConversationID is non-empty and then handle the returned error
instead of discarding it — call UpdateConversationTitle only when ConversationID
!= "" and if it returns an error log it with context (e.g., include
ConversationID and which title was attempted) so failures are visible in logs.
conversation/convert.go-40-55 (1)

40-55: ⚠️ Potential issue | 🟡 Minor

Tool result merging doesn't propagate error status to IsError field.

When merging a tool_result block into an existing ToolCall, only the Result is set. If the tool result has StatusError, this isn't reflected in the ToolCall.IsError field. This could affect downstream LLM behavior that relies on IsError to handle failed tool calls differently.

Proposed fix to propagate error status
 		case BlockTypeToolResult:
 			merged := false
+			isErr := block.Status == StatusError
 			for i := range post.ToolUse {
 				if post.ToolUse[i].ID == block.ToolUseID {
 					post.ToolUse[i].Result = block.Content
+					post.ToolUse[i].IsError = isErr
 					merged = true
 					break
 				}
 			}
 			if !merged {
 				post.ToolUse = append(post.ToolUse, llm.ToolCall{
 					ID:     block.ToolUseID,
 					Result: block.Content,
 					Status: StatusFromString(block.Status),
+					IsError: isErr,
 				})
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversation/convert.go` around lines 40 - 55, When handling
BlockTypeToolResult you only update ToolCall.Result and don't propagate the
error status into ToolCall.IsError; update the existing merged branch to set
post.ToolUse[i].Status = StatusFromString(block.Status) and set
post.ToolUse[i].IsError = (post.ToolUse[i].Status == StatusError) (or equivalent
boolean) so the IsError flag reflects the parsed status, and when appending a
new llm.ToolCall include Status: StatusFromString(block.Status) and IsError:
(StatusFromString(block.Status) == StatusError) so newly added calls also have
IsError set correctly.
search/search_conversation_test.go-408-409 (1)

408-409: ⚠️ Potential issue | 🟡 Minor

Inconsistent mock expectation signature.

Line 408 sets up ChatCompletionNoStream with only one argument matcher, but line 312 in TestSearchQueryToolsAlwaysDisabled expects two arguments. This could cause test flakiness or unexpected behavior.

Proposed fix
-	mockLLM.On("ChatCompletionNoStream", mock.Anything).
+	mockLLM.On("ChatCompletionNoStream", mock.Anything, mock.Anything).
 		Return("fallback answer", nil)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@search/search_conversation_test.go` around lines 408 - 409, The mock for
ChatCompletionNoStream is defined with a single-argument matcher but other tests
(notably TestSearchQueryToolsAlwaysDisabled) expect two arguments, causing
inconsistent expectations; update the mock setup for
mockLLM.On("ChatCompletionNoStream", ...) to use the same argument signature
everywhere (e.g., mock.Anything, mock.Anything) so the method call signature
matches across tests and returns the same ("fallback answer", nil) for the
two-argument form; locate uses of ChatCompletionNoStream and the mockLLM
declaration to apply the consistent matcher change.
conversation/service_test.go-1071-1098 (1)

1071-1098: ⚠️ Potential issue | 🟡 Minor

Test name TestSequenceNumbering_Concurrent is misleading: it tests sequential, not concurrent, behavior.

The test creates placeholder turns in a sequential for loop (lines 1085-1088). To test actual concurrency, you would need to spawn goroutines and use synchronization primitives. The current test validates sequential sequence numbering, which is still valuable, but the name should reflect that.

✏️ Suggested rename
-func TestSequenceNumbering_Concurrent(t *testing.T) {
+func TestSequenceNumbering_Sequential(t *testing.T) {

Or, if you want to actually test concurrent behavior:

func TestSequenceNumbering_Concurrent(t *testing.T) {
    svc, s := setupTestService(t)
    // ... setup ...
    
    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            _, _ = svc.CreatePlaceholderAssistantTurn(convID, nil)
        }()
    }
    wg.Wait()
    
    // Verify sequences are unique (though not necessarily contiguous under concurrency)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversation/service_test.go` around lines 1071 - 1098, The test
TestSequenceNumbering_Concurrent is misnamed because it performs sequential
calls; either rename it to TestSequenceNumbering_Sequential (or similar) to
reflect its current behavior, or change the body to actually run concurrent
calls to CreatePlaceholderAssistantTurn by launching goroutines and
synchronizing with sync.WaitGroup, then call s.GetTurnsForConversation and
adjust assertions to check for uniqueness (and any concurrency-appropriate
properties) rather than strict contiguous ordering; reference
TestSequenceNumbering_Concurrent, CreatePlaceholderAssistantTurn, and
GetTurnsForConversation when making the change.
conversations/tool_approval.go-189-221 (1)

189-221: ⚠️ Potential issue | 🟡 Minor

Parameter acceptedToolIDs is unused in HandleToolResult.

The function signature includes acceptedToolIDs []string but it's never used. If this parameter is no longer needed in the conversation entity model (as the comment on lines 190-192 suggests), consider removing it from the signature. If it's kept for future use or API consistency, document that it's intentionally unused.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/tool_approval.go` around lines 189 - 221, The parameter
acceptedToolIDs on Conversations.HandleToolResult is declared but never used;
either remove it from the function signature (and update all callers to match)
or explicitly mark it as intentionally unused to satisfy linters and reviewers
(for example by renaming to _acceptedToolIDs or adding a comment). If you remove
it, update any call sites that invoke HandleToolResult; if you keep it for API
stability, add a clear comment in the HandleToolResult signature and ensure no
build/lint warnings remain. Reference: Conversations.HandleToolResult,
acceptedToolIDs, and the call into streamToolFollowUp.
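If the parameter is kept, the "mark as intentionally unused" option can look like the following minimal sketch. The Conversations type and the simplified HandleToolResult signature here are illustrative stand-ins, not the real ones in conversations/tool_approval.go.

```go
package main

import "fmt"

// Simplified stand-in for the real Conversations type.
type Conversations struct{}

// The blank identifier documents that the accepted-tool-ID slice is
// intentionally ignored under the conversation entity model, while keeping
// the parameter in place for API stability and silencing unused-parameter
// linters.
func (c *Conversations) HandleToolResult(userID string, _ []string) (string, error) {
	return "handled for " + userID, nil
}

func main() {
	c := &Conversations{}
	out, err := c.HandleToolResult("user1", []string{"tool-a"})
	fmt.Println(out, err == nil)
}
```

Callers keep passing the slice unchanged; only the receiving signature advertises that the value is discarded.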
conversations/tool_approval.go-307-329 (1)

307-329: ⚠️ Potential issue | 🟡 Minor

Silent error handling in findPendingToolTurn may hide data corruption issues.

At lines 316-317, if json.Unmarshal fails, the function silently continues to the next turn. While this is defensive, it could mask data corruption in stored turn content. Consider logging the error before continuing.

🛡️ Suggested improvement
 		var blocks []conversation.ContentBlock
 		if err := json.Unmarshal(turns[i].Content, &blocks); err != nil {
+			// Log but continue - corrupted turn content shouldn't block finding pending tools
+			// Consider: c.mmClient.LogWarn("Failed to unmarshal turn content", "turnID", turns[i].ID, "error", err)
 			continue
 		}

Note: This would require passing a logger to the function or making it a method on *Conversations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/tool_approval.go` around lines 307 - 329, findPendingToolTurn
currently swallows json.Unmarshal errors when decoding turns[i].Content which
can hide data corruption; modify the function to surface/log the unmarshal error
instead of silently continuing: either add a logger parameter (e.g., ctxLogger)
to findPendingToolTurn and call ctxLogger.Error / log the error when
json.Unmarshal fails (include turn index or turn.ID and the error), or convert
the function into a method on *Conversations so you can use the existing
Conversations logger; ensure you still continue to the next turn after logging
so behavior remains the same.
🧹 Nitpick comments (25)
webapp/jest.config.js (1)

4-14: Consider adding common webapp test configuration patterns.

The configuration is functional but could benefit from a few additions commonly used in webapp test suites:

  • Explicit testMatch or testRegex patterns to clarify which files are tests
  • collectCoverageFrom to define coverage collection patterns
  • Module mocks for CSS/images that React components might import
📋 Example enhanced configuration
 module.exports = {
     testEnvironment: 'jsdom',
     transform: {
         '^.+\\.tsx?$': 'ts-jest',
     },
     moduleNameMapper: {
+        '\\.(css|less|scss|sass)$': 'identity-obj-proxy',
+        '\\.(jpg|jpeg|png|gif|svg)$': '<rootDir>/tests/__mocks__/fileMock.js',
         '^@/(.*)$': '<rootDir>/src/$1',
         '^src/(.*)$': '<rootDir>/src/$1',
     },
+    testMatch: [
+        '<rootDir>/src/**/__tests__/**/*.{ts,tsx}',
+        '<rootDir>/src/**/*.{spec,test}.{ts,tsx}',
+    ],
+    collectCoverageFrom: [
+        'src/**/*.{ts,tsx}',
+        '!src/**/*.d.ts',
+        '!src/**/__tests__/**',
+    ],
     setupFilesAfterEnv: ['<rootDir>/tests/setup.tsx'],
 };

Note: For CSS mocking, you'd need to install identity-obj-proxy as a dev dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@webapp/jest.config.js` around lines 4 - 14, Add common webapp Jest patterns
to the existing jest.config.js by extending the current config (testEnvironment,
transform, moduleNameMapper, setupFilesAfterEnv): add an explicit testMatch (or
testRegex) to target test files, add collectCoverageFrom with include/exclude
globs for src/** to control coverage, and expand moduleNameMapper to mock CSS
and static assets (e.g., map \\.(css|scss)$ to identity-obj-proxy and
images/fonts to a file mock). Ensure you note the dependency on
identity-obj-proxy for CSS mapping and create a simple file mock (e.g.,
tests/fileMock.js) referenced by moduleNameMapper.
conversations/web_search_context.go (1)

86-95: Inconsistent error handling for allowed URLs vs context values.

The error handling differs between WebSearchContextKey (lines 64-83, returns nil on any error) and WebSearchAllowedURLsKey (lines 86-95, silently continues on error). This asymmetry could mask deserialization bugs for allowed URLs.

If allowed URLs are optional and errors should be tolerated, consider adding a debug log for consistency:

Suggested improvement for error visibility
 	// Reconstruct allowed URLs
 	if raw, ok := params[mmtools.WebSearchAllowedURLsKey]; ok {
 		urlBytes, marshalErr := json.Marshal(raw)
-		if marshalErr == nil {
+		if marshalErr != nil {
+			c.mmClient.LogDebug("Failed to re-marshal allowed URLs", "error", marshalErr, "post_id", postID)
+		} else {
 			var allowedURLs []string
-			if unmarshalErr := json.Unmarshal(urlBytes, &allowedURLs); unmarshalErr == nil {
+			if unmarshalErr := json.Unmarshal(urlBytes, &allowedURLs); unmarshalErr != nil {
+				c.mmClient.LogDebug("Failed to unmarshal allowed URLs", "error", unmarshalErr, "post_id", postID)
+			} else {
 				params[mmtools.WebSearchAllowedURLsKey] = allowedURLs
 				c.mmClient.LogDebug("Reconstructed allowed URLs", "post_id", postID, "num_urls", len(allowedURLs))
 			}
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/web_search_context.go` around lines 86 - 95, The
WebSearchAllowedURLsKey branch currently swallows marshal/unmarshal errors while
WebSearchContextKey returns nil on errors; make handling consistent by logging
debug messages and setting params[mmtools.WebSearchAllowedURLsKey] = nil when
json.Marshal or json.Unmarshal fail. Specifically, in the block that handles
WebSearchAllowedURLsKey, add c.mmClient.LogDebug(...) calls for marshalErr and
unmarshalErr (mirroring the WebSearchContextKey logs) and ensure you explicitly
set the params entry to nil on error so deserialization failures are visible and
treated the same as the WebSearchContextKey path.
streaming/post_modifier.go (1)

16-26: Remove unused requesterUserID parameter from ModifyPostForBot.

The requesterUserID parameter is not used within the function body. All 6 callers would need to be updated: streaming/streaming.go (2 calls), search/search.go, meetings/meeting_summarization.go, conversations/handle_messages.go, and conversations/conversations.go.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@streaming/post_modifier.go` around lines 16 - 26, The parameter
requesterUserID on ModifyPostForBot is unused; remove it from the function
signature (ModifyPostForBot(botid string, post *model.Post, respondingToPostID
string)) and update all callers to stop passing that argument:
streaming/streaming.go (two calls), search/search.go,
meetings/meeting_summarization.go, conversations/handle_messages.go, and
conversations/conversations.go; ensure you only change the function signature
and call sites (keep post.UserId, Type, UnsafeLinksPostProp and RespondingToProp
logic intact) and run a build to fix any remaining references.
store/store_test.go (1)

96-104: Exercise the destructive upgrade path in this test.

These assertions only verify the fresh-install end state. They won't catch the loss of existing LLM_PostMeta rows during an upgrade, which is the risky path for migration 000006. Seed a pre-000006 schema with legacy data, run RunMigrations, and assert that the replacement conversation records survive.

Also applies to: 116-134

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store/store_test.go` around lines 96 - 104, The test currently only verifies
a fresh-install end state; modify it to exercise the destructive upgrade path by
seeding a pre-000006 schema and data for the legacy LLM_PostMeta table, running
RunMigrations, and asserting the migration preserves/creates the expected
conversation records. Concretely: before calling RunMigrations, use s.db to
create the legacy llm_postmeta table and insert representative rows tied to a
conversation, then call RunMigrations(ctx, s.db) (or the test helper that runs
migrations), and finally query the new conversation/conversation meta records to
assert the original data was migrated and not lost; repeat the same pattern for
the other assertion block around lines 116-134.
search/search_test.go (1)

607-614: Cover the new conversation dependency in a happy-path search test.

This SearchQuery path still passes nil for the new conversation dependency, so the response assertions can stay green without ever checking that the migrated search flow actually writes or continues a conversation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@search/search_test.go` around lines 607 - 614, The test constructs the Search
engine with New(..., nil, nil, nil) and then exercises SearchQuery without
providing the new conversation dependency, so it never asserts that the migrated
flow writes/continues a conversation; update the happy-path test to pass a mock
or in-memory conversation service instead of nil (create or reuse a stub that
records calls) when calling New (the constructor used in the diff) and then
assert that SearchQuery invoked the expected conversation methods on that mock
(e.g., started/continued conversation calls), while keeping existing response
assertions using mockEmbedding, mockClient, and promptsObj.
store/migrations/000004_create_conversations_table.up.sql (1)

1-16: Add an index for the per-user summary/list path.

This schema only indexes the thread/bot lookup. If conversation summaries are read by UserID and recency, the new RHS/list path will full-scan and sort LLM_Conversations as histories grow.

Possible supporting index
+CREATE INDEX IF NOT EXISTS idx_llm_conversations_user_updated
+    ON LLM_Conversations(UserID, DeleteAt, UpdatedAt DESC);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store/migrations/000004_create_conversations_table.up.sql` around lines 1 -
16, The migration is missing an index for querying conversation summaries by
user and recency; add a non-unique index on LLM_Conversations for the per-user
list path (e.g., create index idx_llm_conversations_user_updatedat on
LLM_Conversations(UserID, UpdatedAt) — or UserID, UpdatedAt DESC if your SQL
dialect supports descending order) so reads by UserID ordered by UpdatedAt avoid
full-table scans; update the migration to include this new index alongside the
existing idx_llm_conversations_thread_bot.
conversations/conversations_test.go (1)

123-134: Remove the discarded setup.

The conversations.New(...) result on lines 123-134 and the first bots.NewBot(...) on line 156 are never used, so they are just noise now.

Also applies to: 156-156

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/conversations_test.go` around lines 123 - 134, Remove the
unused setup calls that instantiate but never use objects: delete the
conversations.New(...) call and the first bots.NewBot(...) invocation so no
discarded values are created; locate the conversations.New(...) call and the
bots.NewBot(...) call in the test and remove those statements (or assign to a
meaningful variable and use it if intended) to eliminate the dead/noise setup.
streaming/test_helpers_test.go (1)

25-31: Snapshot published event data before recording it.

payload and broadcast are stored by reference here. If the code under test mutates either after publishing, the recorded event history changes underneath the assertions.

Possible fix
 func (c *fakeStreamingClient) PublishWebSocketEvent(event string, payload map[string]interface{}, broadcast *model.WebsocketBroadcast) {
+	payloadCopy := make(map[string]interface{}, len(payload))
+	for k, v := range payload {
+		payloadCopy[k] = v
+	}
+
+	var broadcastCopy *model.WebsocketBroadcast
+	if broadcast != nil {
+		bc := *broadcast
+		broadcastCopy = &bc
+	}
+
 	c.events = append(c.events, publishedEvent{
 		event:     event,
-		payload:   payload,
-		broadcast: broadcast,
+		payload:   payloadCopy,
+		broadcast: broadcastCopy,
 	})
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@streaming/test_helpers_test.go` around lines 25 - 31, The test helper
PublishWebSocketEvent on fakeStreamingClient appends payload and broadcast by
reference, so subsequent mutations to payload or broadcast will change recorded
events; modify PublishWebSocketEvent to snapshot the data before storing by
creating a shallow copy of the payload map (copy each key/value into a new map)
and a copied value of the broadcast (dereference and copy the struct into a new
model.WebsocketBroadcast value or deep-copy fields as needed) and append those
copies to the events slice (referencing publishedEvent, events,
PublishWebSocketEvent, and model.WebsocketBroadcast).
store/turns.go (1)

73-88: Update operations silently succeed when turn ID doesn't exist.

UpdateTurnContent and UpdateTurnTokens don't verify that any rows were affected. If an invalid turn ID is passed, the operation silently succeeds without error. Consider checking result.RowsAffected() if updates to non-existent turns should be treated as errors.

Example: Checking affected rows
 func (s *Store) UpdateTurnContent(id string, content json.RawMessage) error {
 	query, args, err := s.builder.
 		Update("LLM_Turns").
 		Set("Content", content).
 		Where(sq.Eq{"ID": id}).
 		ToSql()
 	if err != nil {
 		return fmt.Errorf("failed to build update turn content query: %w", err)
 	}
-	_, err = s.db.Exec(query, args...)
+	result, err := s.db.Exec(query, args...)
 	if err != nil {
 		return fmt.Errorf("failed to update turn content: %w", err)
 	}
+	affected, err := result.RowsAffected()
+	if err != nil {
+		return fmt.Errorf("failed to check affected rows: %w", err)
+	}
+	if affected == 0 {
+		return fmt.Errorf("turn not found: %s", id)
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store/turns.go` around lines 73 - 88, UpdateTurnContent (and similarly
UpdateTurnTokens) currently executes the UPDATE but doesn't check whether any
rows were affected, so passing a non-existent turn ID silently succeeds; modify
UpdateTurnContent to capture the sql.Result from s.db.Exec(query, args...), call
result.RowsAffected(), and return a clear error (e.g., "no turn found with id
%s") when RowsAffected() == 0; apply the same change to UpdateTurnTokens to
ensure both functions surface a not-found condition instead of succeeding
silently.
channels/analysis_conversation_test.go (1)

79-87: In-memory store GetTurnsForConversation doesn't preserve sequence ordering.

The GetTurnsForConversation method returns turns in insertion order (order of turnsByConv slice) rather than by Sequence field. While tests may pass due to insertion order matching sequence order, this differs from the real store's ORDER BY Sequence ASC behavior.

Proposed fix to sort by sequence
+import "sort"
+
 func (s *inMemoryStore) GetTurnsForConversation(conversationID string) ([]store.Turn, error) {
 	ids := s.turnsByConv[conversationID]
 	turns := make([]store.Turn, 0, len(ids))
 	for _, id := range ids {
 		if t, ok := s.turns[id]; ok {
 			turns = append(turns, *t)
 		}
 	}
+	sort.Slice(turns, func(i, j int) bool {
+		return turns[i].Sequence < turns[j].Sequence
+	})
 	return turns, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@channels/analysis_conversation_test.go` around lines 79 - 87,
GetTurnsForConversation in inMemoryStore collects turns from turnsByConv but
returns them in insertion order instead of ordering by the Turn.Sequence field
like the real store; update the method (GetTurnsForConversation) to sort the
collected []store.Turn by each turn.Sequence in ascending order before returning
(use the built-in sort with a comparator that compares turn.Sequence),
referencing the store.Turn type and the inMemoryStore's turns map/turnsByConv
slice to locate the code to change.
api/api_post.go (1)

308-316: Error matching via string comparison is fragile.

The error handling at line 311 uses string comparison (err.Error() == "no pending tool calls found..."), which is fragile and could break if the error message changes. Consider using sentinel errors or error type assertions instead.

Proposed approach using sentinel errors

Define sentinel errors in the conversations package:

// In conversations package
var (
    ErrNoPendingToolCalls = errors.New("no pending tool calls found in conversation")
    ErrMissingConversationID = errors.New("post missing conversation_id")
)

Then use errors.Is for checking:

 	err := a.conversationsService.HandleToolCall(userID, post, channel, data.AcceptedToolIDs)
 	if err != nil {
 		switch {
-		case err.Error() == "no pending tool calls found in conversation" || err.Error() == "post missing conversation_id":
+		case errors.Is(err, conversations.ErrNoPendingToolCalls) || errors.Is(err, conversations.ErrMissingConversationID):
 			c.AbortWithError(http.StatusBadRequest, err)
 		default:
 			c.AbortWithError(http.StatusInternalServerError, err)
 		}
 		return
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_post.go` around lines 308 - 316, Replace fragile string comparisons
in the error handling after a.conversationsService.HandleToolCall by introducing
sentinel errors in the conversations package (e.g., ErrNoPendingToolCalls and
ErrMissingConversationID) and then check returned errors using errors.Is (or
type assertions) instead of err.Error() == "..."; update the callsite to import
the standard "errors" package and use errors.Is(err,
conversations.ErrNoPendingToolCalls) and errors.Is(err,
conversations.ErrMissingConversationID) to decide whether to return
http.StatusBadRequest or http.StatusInternalServerError.
search/search.go (1)

465-474: Token counts are hardcoded to zero for non-streaming search.

FinalizeAssistantTurn is called with 0, 0 for token counts. Unlike the streaming path, which receives token usage events, the non-streaming ChatCompletionNoStream doesn't return usage data. If token tracking matters for search conversations, consider a completion method that does return usage, or document that this limitation is accepted.

toolrunner/toolrunner_test.go (1)

262-348: Consider consolidating similar test cases into table-driven format.

The partial approval and mixed batch tests (TestToolRunner_PartialApproval_NoneExecuted, TestToolRunner_MixedBatch_AllOrNothing) test similar approval logic patterns and could be consolidated into a single table-driven test for better maintainability. Based on learnings: "Write Go unit tests as table-driven tests whenever possible."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@toolrunner/toolrunner_test.go` around lines 262 - 348, Two tests duplicate
approval logic: TestToolRunner_PartialApproval_NoneExecuted and
TestToolRunner_MixedBatch_AllOrNothing; consolidate them into a single
table-driven test (e.g., TestToolRunner_ApprovalCases) that defines cases with a
name, testLLM response, shouldExecute func, and expected outcomes (presence of
text/tool call events, ToolTurns empty/non-empty, inner.callCount). For each
case create a fresh testLLM and runner := New(inner), call runner.Run(request,
tc.shouldExecute) inside t.Run(case.name, ...), then assert the stream events
and result.ToolTurns and inner.callCount against the case expectations; this
removes duplicate setup/iteration logic while keeping assertions per case.
Ensure you reference and reuse existing identifiers: testLLM, New, runner.Run,
llm.CompletionRequest, and the shouldExecute closure.
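The all-or-nothing approval policy those two tests exercise reduces to a small decision function, which is what makes the table-driven shape attractive. A hedged sketch follows; approvalCase and decide are illustrative stand-ins, and the real consolidated test would instead drive runner.Run with a fresh testLLM and a shouldExecute closure per case.

```go
package main

import "fmt"

// approvalCase captures one scenario: which tools the LLM requested, which
// the user approved, and how many should actually execute.
type approvalCase struct {
	name         string
	requested    []string
	approved     map[string]bool
	wantExecuted int
}

// decide applies the all-or-nothing policy from the reviewed tests: tools
// run only when every requested tool in the batch is approved.
func decide(requested []string, approved map[string]bool) int {
	for _, id := range requested {
		if !approved[id] {
			return 0
		}
	}
	return len(requested)
}

func main() {
	cases := []approvalCase{
		{"partial approval, none executed", []string{"a", "b"}, map[string]bool{"a": true}, 0},
		{"mixed batch, all approved", []string{"a", "b"}, map[string]bool{"a": true, "b": true}, 2},
	}
	for _, tc := range cases {
		got := decide(tc.requested, tc.approved)
		fmt.Printf("%s: executed=%d (want %d)\n", tc.name, got, tc.wantExecuted)
	}
}
```

In the real test, each case body would run inside t.Run(tc.name, ...) so failures report per-scenario.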
conversations/channel_mention_test.go (1)

30-43: Context cancel timing could cause issues.

The cancel() is called on line 39 before container.ConnectionString(...) is called on line 45. If the container startup takes close to 2 minutes, canceling the context immediately after tcpostgres.Run returns could affect subsequent container operations that reuse the context internally.

Consider deferring the cancel or using a fresh context for ConnectionString:

Suggested approach
 func TestMain(m *testing.M) {
 	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
+	defer cancel()
 	container, err := tcpostgres.Run(ctx,
 		"postgres:16-alpine",
 		tcpostgres.WithDatabase("testdb"),
 		tcpostgres.WithUsername("testuser"),
 		tcpostgres.WithPassword("testpass"),
 		tcpostgres.BasicWaitStrategies(),
 	)
-	cancel()
 	if err != nil {
 		fmt.Printf("Failed to start postgres container: %v\n", err)
 		os.Exit(1)
 	}

-	channelMentionTestConnStr, err = container.ConnectionString(context.Background(), "sslmode=disable")
+	channelMentionTestConnStr, err = container.ConnectionString(ctx, "sslmode=disable")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/channel_mention_test.go` around lines 30 - 43, The TestMain
function cancels the context (cancel()) immediately after tcpostgres.Run returns
which can prematurely invalidate the context for subsequent container operations
such as container.ConnectionString; move the cancel to a defer (defer cancel())
after creating the context or create a fresh context when calling
container.ConnectionString to ensure the context used by tcpostgres.Run and
later calls remains valid; update TestMain to use defer cancel() or a new
context prior to calling container.ConnectionString and other container methods
referencing the context.
streaming/streaming.go (1)

326-329: Edge case: nextSeq starts at 0 for empty turns.

When len(turns) == 0, nextSeq is 0. Verify this is the intended starting sequence (some systems use 1-based sequences).

Consider starting at 1
 	nextSeq := 0
 	if len(turns) > 0 {
 		nextSeq = turns[len(turns)-1].Sequence + 1
+	} else {
+		nextSeq = 1
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@streaming/streaming.go` around lines 326 - 329, The code sets nextSeq to 0
when turns is empty which may be unintended if sequences are 1-based; in the
block that computes nextSeq from turns (the variable nextSeq and the turns
slice, using turns[len(turns)-1].Sequence), decide the canonical starting
sequence and update the initialization accordingly (e.g., set nextSeq := 1 for a
1-based sequence) or introduce a named constant like sequenceStart and use that
both for the empty-case initializer and any sequence arithmetic to ensure
consistent behavior across functions that consume nextSeq.
store/conversations.go (1)

122-137: Update operations don't verify rows were affected.

UpdateConversationTitle and UpdateConversationRootPostID execute updates without checking if any rows were actually modified. If called with a non-existent ID, they silently succeed with no rows affected.

Consider returning ErrConversationNotFound when no rows are updated:

Proposed fix
 func (s *Store) UpdateConversationTitle(id, title string) error {
 	query, args, err := s.builder.
 		Update("LLM_Conversations").
 		Set("Title", title).
 		Set("UpdatedAt", model.GetMillis()).
 		Where(sq.Eq{"ID": id}).
 		ToSql()
 	if err != nil {
 		return fmt.Errorf("failed to build update title query: %w", err)
 	}
-	_, err = s.db.Exec(query, args...)
+	result, err := s.db.Exec(query, args...)
 	if err != nil {
 		return fmt.Errorf("failed to update conversation title: %w", err)
 	}
+	rowsAffected, err := result.RowsAffected()
+	if err != nil {
+		return fmt.Errorf("failed to get rows affected: %w", err)
+	}
+	if rowsAffected == 0 {
+		return ErrConversationNotFound
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store/conversations.go` around lines 122 - 137, UpdateConversationTitle (and
similarly UpdateConversationRootPostID) performs the UPDATE but never checks
whether any rows were affected, so calls with a non-existent ID silently
succeed; change both functions to capture the sql.Result from s.db.Exec, call
RowsAffected(), and if rows == 0 return ErrConversationNotFound (propagate other
Exec errors as before); reference the functions UpdateConversationTitle and
UpdateConversationRootPostID and the sentinel ErrConversationNotFound when
making this change.
threads/analysis_conversation_test.go (2)

316-318: Stream result intentionally discarded.

The _, _ = result.Stream.ReadAll() explicitly discards both the content and potential error. This is acceptable in test code where the stream content isn't being validated, but consider adding a brief comment explaining the intent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@threads/analysis_conversation_test.go` around lines 316 - 318, The call to
result.Stream.ReadAll() currently discards both data and error (_, _ =
result.Stream.ReadAll()) which is fine in this test but lacks intent; update the
test around the ReadAll call (result.Stream.ReadAll()) to include a short
clarifying comment explaining that both the stream content and any error are
intentionally ignored in this test (e.g., "intentionally discarding stream data
and error because this test only exercises X"), so future readers understand
this is deliberate.

19-20: Test file uses mocking library contrary to coding guidelines.

Similar to search/search_conversation_test.go, this file uses testify/mock for LLM and mmapi interfaces, which conflicts with the coding guideline: "Never use mocking or introduce new testing libraries in Go tests."

Consider implementing fake implementations like the testBotLookup already in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@threads/analysis_conversation_test.go` around lines 19 - 20, The test imports
testify-based mocks (mocks, mmapimocks) which violates the no-mocking guideline;
replace uses of those testify/mock types for the LLM and mmapi interfaces with
hand-written fake implementations (following the pattern of the existing
testBotLookup fake) and update the tests in analysis_conversation_test.go to
instantiate and assert against those fakes instead of mocks from packages
"github.com/mattermost/mattermost-plugin-agents/llm/mocks" and
"github.com/mattermost/mattermost-plugin-agents/mmapi/mocks"; ensure the fakes
implement the same interfaces used by the code under test and provide hooks to
control returned values/record calls so assertions in the tests can be
translated from mock.Expect/Assert calls to simple field checks or methods on
the fakes.
search/search_conversation_test.go (1)

22-23: Test file uses mocking library contrary to coding guidelines.

The coding guidelines state: "Never use mocking or introduce new testing libraries in Go tests." This file uses testify/mock for MockEmbeddingSearch, MockClient, and MockLanguageModel.

Consider refactoring to use in-memory fakes (like fakeConversationStore already in this file) instead of mocks, or confirm with the team whether this guideline applies to these external service interfaces.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@search/search_conversation_test.go` around lines 22 - 23, The test imports
and uses testify/mock types (MockEmbeddingSearch, MockClient, MockLanguageModel)
which violates the "no mocking" guideline; replace those mocked types with
lightweight in-memory fakes or the existing fakeConversationStore pattern used
in this file: implement simple fakeEmbeddingSearch, fakeClient, and
fakeLanguageModel that satisfy the same interfaces used by the tests, update
tests to instantiate and inject these fakes instead of
MockEmbeddingSearch/MockClient/MockLanguageModel, and remove the testify/mock
import and any mock-specific expectations/assertions.
conversations/handle_messages.go (1)

241-245: Tool turn write failures are logged but not returned.

The error from WriteToolTurns is logged but not propagated. This means the conversation will continue even if tool turn persistence fails. Ensure this graceful degradation is intentional and that the LLM won't re-execute tools in subsequent turns due to missing history.
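The fail-hard option can be sketched like this. writeToolTurns and persistToolTurns are hypothetical stand-ins for the convService.WriteToolTurns call site, not the plugin's actual signatures:

```go
package main

import (
	"errors"
	"fmt"
)

// writeToolTurns stands in for convService.WriteToolTurns; the real method
// takes a conversation ID and the tool turns to persist.
func writeToolTurns(fail bool) error {
	if fail {
		return errors.New("db unavailable")
	}
	return nil
}

// persistToolTurns shows the fail-hard option: wrap the store error with
// context and return it, instead of logging and continuing with missing
// tool history.
func persistToolTurns(fail bool) error {
	if err := writeToolTurns(fail); err != nil {
		return fmt.Errorf("failed to write tool turns: %w", err)
	}
	return nil
}

func main() {
	fmt.Println(persistToolTurns(true))
}
```

Wrapping with %w keeps the underlying store error inspectable via errors.Is at the caller.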

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/handle_messages.go` around lines 241 - 245, The code currently
logs WriteToolTurns failures but continues processing, which can cause missing
tool-history and potential re-execution; change the behavior in the block that
calls c.convService.WriteToolTurns (using convResult.Conversation.ID and
result.ToolTurns) so that on writeErr you both log and propagate the error back
to the caller (wrap with context like "failed to write tool turns" and return
it) or alternatively set a persisted flag in the conversation state to prevent
LLM re-executing tools if persistence truly must fail; do not silently swallow
writeErr from WriteToolTurns—either return the error from this handler method or
mark the turns as saved/unavailable to ensure correct subsequent behavior.
conversations/conversations.go (1)

214-218: WriteToolTurns error handling differs from handle_messages.go.

Here, WriteToolTurns failure returns an error (line 216), but in handle_messages.go line 242-244, it only logs the error. Consider making error handling consistent across both paths - either both should fail hard or both should degrade gracefully.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/conversations.go` around lines 214 - 218, The WriteToolTurns
error handling in conversations.go is inconsistent with handle_messages.go:
instead of returning an error after calling c.convService.WriteToolTurns, change
the block that calls WriteToolTurns to match the other path by logging the error
and continuing (i.e., do not return an error), or alternatively make both paths
return; pick one approach and apply it consistently. Locate the call to
c.convService.WriteToolTurns (the block checking len(runResult.ToolTurns) > 0)
and replace the return fmt.Errorf behavior with the same logging-only behavior
used in handle_messages.go (or change the handle_messages.go logging to return
an error) so both code paths behave identically. Ensure you use the same logger
and log level/mechanism as handle_messages.go when logging the WriteToolTurns
failure.
conversations/tool_approval.go (2)

233-246: Duplicate user preference loading logic with HandleToolCall (lines 71-79).

The user-disabled-provider filtering code is nearly identical in both places. Consider extracting a helper method like applyUserToolPreferences(llmContext *llm.Context, user *model.User, channel *model.Channel).
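A minimal sketch of the extracted helper, using illustrative stand-in types (toolStore and userPrefs are not the plugin's real llm.Context or mcp preference types):

```go
package main

import "fmt"

// Illustrative stand-ins for llm.Context's tool store and mcp user prefs;
// the real types live in the llm and mcp packages.
type toolStore struct{ tools map[string]bool }

func (t *toolStore) RemoveToolsByServerOrigin(servers []string) {
	for _, s := range servers {
		delete(t.tools, s)
	}
}

type userPrefs struct{ DisabledServers []string }

// applyUserToolPreferences is the shared helper both call sites would use:
// nil-safe, and a no-op when the user has disabled nothing.
func applyUserToolPreferences(tools *toolStore, prefs userPrefs) {
	if tools == nil || len(prefs.DisabledServers) == 0 {
		return
	}
	tools.RemoveToolsByServerOrigin(prefs.DisabledServers)
}

func main() {
	tools := &toolStore{tools: map[string]bool{"jira": true, "github": true}}
	applyUserToolPreferences(tools, userPrefs{DisabledServers: []string{"jira"}})
	fmt.Println(tools.tools) // prints map[github:true]
}
```

Both HandleToolCall and the block around BuildLLMContextUserRequest would then call this one helper, so a future change to preference handling lands in a single place.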

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/tool_approval.go` around lines 233 - 246, Extract the
duplicated user-disabled-provider filtering into a single helper function (e.g.,
applyUserToolPreferences) and call it from both locations (the current function
around BuildLLMContextUserRequest and HandleToolCall). The helper should accept
the llm.Context (or pointer), the user (*model.User) and the channel
(*model.Channel), perform the isDM/group check, call
mcp.LoadUserPreferences(c.mmClient, user.Id), log warnings using
c.mmClient.LogWarn when prefsErr != nil, and if prefs.DisabledServers is
non-empty and llmContext.Tools != nil invoke
llmContext.Tools.RemoveToolsByServerOrigin(prefs.DisabledServers); replace the
duplicated code blocks with a call to this new helper.

81-123: Consider validating that acceptedToolIDs correspond to actual pending tool calls.

If acceptedToolIDs contains IDs that don't exist in pendingBlocks, they are silently ignored. This is likely fine (the UI should only send valid IDs), but defensive validation could help catch bugs in the frontend.
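The defensive check amounts to a set-membership pass over the accepted IDs. A sketch with hypothetical names (validateAcceptedIDs is not an existing function in this codebase):

```go
package main

import "fmt"

// validateAcceptedIDs partitions the IDs the frontend accepted into those
// matching a pending tool call and those that are unknown. The caller would
// resolve the valid IDs and emit an IsError tool result (plus a log warning)
// for each unknown one. Names are illustrative.
func validateAcceptedIDs(pendingIDs, accepted []string) (valid, unknown []string) {
	pending := make(map[string]struct{}, len(pendingIDs))
	for _, id := range pendingIDs {
		pending[id] = struct{}{}
	}
	for _, id := range accepted {
		if _, ok := pending[id]; ok {
			valid = append(valid, id)
		} else {
			unknown = append(unknown, id)
		}
	}
	return valid, unknown
}

func main() {
	valid, unknown := validateAcceptedIDs(
		[]string{"tool_1", "tool_2"},
		[]string{"tool_2", "tool_9"}, // tool_9 was never pending
	)
	fmt.Println(valid, unknown) // prints [tool_2] [tool_9]
}
```

Only the valid partition would reach llmContext.Tools.ResolveTool; the unknown partition surfaces frontend bugs instead of being silently dropped.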

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/tool_approval.go` around lines 81 - 123, acceptedToolIDs may
contain IDs that don't match any pendingBlocks; add defensive validation by
building a set/map of pending tool IDs from pendingBlocks (using block.ID) and
then iterate acceptedToolIDs to check membership before attempting resolution:
for any accepted ID not present, append a toolrunner.ToolResult with ToolCallID
set to that ID, Name empty or "unknown", Result set to a clear error like
"Accepted tool ID not found in pendingBlocks", IsError true, and optionally log
a warning; only call llmContext.Tools.ResolveTool for IDs that exist in the
pendingBlocks map and update the corresponding block.Status and toolResults as
currently done.
threads/threads.go (1)

79-112: Same redundant GetConversation pattern noted in channels.go applies here.

Lines 80-95 create a conversation then immediately fetch it. Consider the refactor suggested for channels.go to return the conversation directly from CreateConversation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@threads/threads.go` around lines 79 - 112, The code redundantly calls
GetConversation immediately after CreateConversation; instead, change
CreateConversation (or its return type) to return the created conversation data
directly so callers can avoid the extra fetch, then use that returned
conversation with convService.BuildCompletionRequest (replace the
GetConversation call in this function). Update any call sites (threads.go and
channels.go) to accept the new CreateConversation return (e.g., a Conversation
or ConversationID+Conversation) and pass that object to BuildCompletionRequest,
keeping request.OperationSubType assignment and subsequent llm.ChatCompletion
usage unchanged.
channels/channels.go (1)

128-149: Consider returning the conversation directly from CreateConversation to avoid the extra lookup.

CreateConversation is called at line 128, then GetConversation is immediately called at line 141 with the returned ID. This pattern adds an unnecessary database round-trip. The same pattern appears in IntervalWithRequest (lines 239-255).

♻️ Suggested approach

Consider having CreateConversation return the full *store.Conversation object (or add it to CreateConversationResult), so callers don't need to fetch it again immediately after creation.
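The suggested API shape can be sketched as follows. Conversation and service are illustrative stand-ins for store.Conversation and the conversation service, under the assumption that creation already has the full row in hand:

```go
package main

import "fmt"

// Conversation and service are illustrative stand-ins for store.Conversation
// and the conversation service.
type Conversation struct {
	ID    string
	Title string
}

type service struct{ byID map[string]*Conversation }

// CreateConversation returns the created entity itself, so callers can pass
// it straight to BuildCompletionRequest with no follow-up GetConversation.
func (s *service) CreateConversation(title string) (*Conversation, error) {
	conv := &Conversation{ID: "conv-1", Title: title}
	s.byID[conv.ID] = conv
	return conv, nil
}

func main() {
	s := &service{byID: map[string]*Conversation{}}
	conv, err := s.CreateConversation("Thread analysis")
	fmt.Println(conv.ID, conv.Title, err == nil) // no second round-trip needed
}
```

Since creation already holds all the fields it writes, returning the object is free and removes one database read per conversation start.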

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@channels/channels.go` around lines 128 - 149, CreateConversation currently
returns only an ID causing an extra GetConversation lookup; change the convSvc
CreateConversation signature (or its CreateConversationResult) to include and
return the full *store.Conversation (or make CreateConversation return
*store.Conversation, error), then in channels.go replace the convResult +
GetConversation(convResult.ConversationID) sequence by using the returned
conversation directly when calling BuildCompletionRequest; apply the same
change/update to the IntervalWithRequest call path so both callers use the
returned conversation from CreateConversation and remove the redundant
GetConversation call, and update all convSvc implementations to populate and
return the conversation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: a7e71122-4c37-417e-a917-debb877e168d

📥 Commits

Reviewing files that changed from the base of the PR and between e6d75dc and c10461e.

⛔ Files ignored due to path filters (1)
  • webapp/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (89)
  • api/api.go
  • api/api_admin_test.go
  • api/api_channel.go
  • api/api_conversation.go
  • api/api_conversation_test.go
  • api/api_no_tools_test.go
  • api/api_post.go
  • api/api_search_test.go
  • api/api_test.go
  • bots/bots.go
  • channels/analysis_conversation_test.go
  • channels/channels.go
  • channels/channels_eval_test.go
  • conversation/content_block.go
  • conversation/content_block_test.go
  • conversation/convert.go
  • conversation/convert_test.go
  • conversation/helpers.go
  • conversation/service.go
  • conversation/service_test.go
  • conversations/channel_mention_test.go
  • conversations/completion.go
  • conversations/conversations.go
  • conversations/conversations_test.go
  • conversations/direct_message_eval_test.go
  • conversations/dm_conversation_test.go
  • conversations/handle_messages.go
  • conversations/handle_messages_placeholder_test.go
  • conversations/mcp_auto_approval.go
  • conversations/mcp_auto_approval_test.go
  • conversations/regeneration.go
  • conversations/store.go
  • conversations/test_helpers_test.go
  • conversations/tool_approval.go
  • conversations/tool_handling.go
  • conversations/tool_handling_test.go
  • conversations/tool_permissions.go
  • conversations/web_search_context.go
  • e2e/tests/semantic-search/search-sources.spec.ts
  • llm/auto_run_tools.go
  • llm/auto_run_tools_test.go
  • llm/language_model.go
  • llm/tools.go
  • llm/tools_test.go
  • mcp/testhelpers_test.go
  • mcp/tool_policy.go
  • mcpserver/eval_helpers_test.go
  • mcpserver/tools_eval_test.go
  • meetings/transcriptions.go
  • mmtools/provider_test.go
  • postgres/pgvector_test.go
  • search/search.go
  • search/search_citation_eval_test.go
  • search/search_conversation_test.go
  • search/search_test.go
  • server/main.go
  • store/conversations.go
  • store/conversations_test.go
  • store/migrations/000004_create_conversations_table.down.sql
  • store/migrations/000004_create_conversations_table.up.sql
  • store/migrations/000005_create_turns_table.down.sql
  • store/migrations/000005_create_turns_table.up.sql
  • store/migrations/000006_drop_post_meta_table.down.sql
  • store/migrations/000006_drop_post_meta_table.up.sql
  • store/store_test.go
  • store/turns.go
  • store/turns_test.go
  • streaming/auto_approval_test.go
  • streaming/post_modifier.go
  • streaming/streaming.go
  • streaming/test_helpers_test.go
  • streaming/tool_call_redaction_test.go
  • streaming/turn_persistence_test.go
  • threads/analysis_conversation_test.go
  • threads/threads.go
  • threads/threads_test.go
  • toolrunner/toolrunner.go
  • toolrunner/toolrunner_test.go
  • webapp/jest.config.js
  • webapp/package.json
  • webapp/src/client.tsx
  • webapp/src/components/llmbot_post/llmbot_post.tsx
  • webapp/src/components/llmbot_post/turn_content_utils.test.ts
  • webapp/src/components/llmbot_post/turn_content_utils.ts
  • webapp/src/components/rhs/rhs.tsx
  • webapp/src/hooks/use_conversation.test.ts
  • webapp/src/hooks/use_conversation.ts
  • webapp/src/index.tsx
  • webapp/src/types/conversation.ts
💤 Files with no reviewable changes (14)
  • conversations/completion.go
  • llm/tools_test.go
  • llm/language_model.go
  • conversations/mcp_auto_approval_test.go
  • conversations/handle_messages_placeholder_test.go
  • conversations/mcp_auto_approval.go
  • conversations/tool_permissions.go
  • streaming/tool_call_redaction_test.go
  • llm/auto_run_tools.go
  • llm/tools.go
  • llm/auto_run_tools_test.go
  • conversations/tool_handling_test.go
  • conversations/tool_handling.go
  • streaming/auto_approval_test.go

Comment thread conversation/helpers.go
Comment thread conversation/service.go
Comment thread conversations/conversations_test.go
Comment thread conversations/direct_message_eval_test.go
Comment thread conversations/regeneration.go Outdated
Comment thread store/migrations/000006_create_turns_table.up.sql Outdated
Comment thread store/migrations/000007_drop_post_meta_table.down.sql Outdated
Comment thread store/migrations/000006_drop_post_meta_table.up.sql Outdated
Comment thread toolrunner/toolrunner.go
Comment thread toolrunner/toolrunner.go Outdated
@crspeller
Member Author

@coderabbitai resolve

# Conflicts:
#	conversations/conversations.go
#	conversations/conversations_test.go
#	conversations/direct_message_eval_test.go
#	conversations/handle_messages.go
#	conversations/mcp_auto_approval.go
#	conversations/mcp_auto_approval_test.go
#	conversations/regeneration.go
#	conversations/tool_handling.go
#	conversations/tool_permissions.go
#	streaming/post_modifier.go

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
conversations/handle_messages.go (2)

279-288: Consider adding panic recovery to the goroutine.

If GenerateTitle panics, it could affect plugin stability since goroutine panics propagate. Adding a recover wrapper is a defensive practice for background work.

Also, minor: line 286 logs genErr.Error() while line 270 passes the error value directly. Prefer passing the error itself ("error", genErr) in both places for consistency.

🛡️ Suggested fix with panic recovery
 go func() {
+	defer func() {
+		if r := recover(); r != nil {
+			c.mmClient.LogError("Panic in title generation", "error", r)
+		}
+	}()
 	if genErr := c.convService.GenerateTitle(
 		convResult.Conversation.ID,
 		bot.LLM(),
 		post.Message,
 		llmContext,
 	); genErr != nil {
-		c.mmClient.LogError("Failed to generate title", "error", genErr.Error())
+		c.mmClient.LogError("Failed to generate title", "error", genErr)
 	}
 }()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/handle_messages.go` around lines 279 - 288, Wrap the anonymous
goroutine calling c.convService.GenerateTitle(...) with a defer/recover to catch
and log panics so background panics don't crash the process; inside the
goroutine use defer func(){ if r := recover(); r != nil {
c.mmClient.LogError("panic generating title", "panic", fmt.Sprintf("%v", r))
}}() before invoking c.convService.GenerateTitle, and also make the error
logging consistent by passing the error value (e.g. "error", genErr) to
c.mmClient.LogError instead of genErr.Error(); ensure you import fmt if needed
and keep references to GenerateTitle, convService, and mmClient unchanged.

372-378: Same goroutine pattern as handleMentionViaConversation.

Consider adding panic recovery here as well for consistency and robustness.

🛡️ Suggested fix
 if dmResult.IsNew {
 	go func() {
+		defer func() {
+			if r := recover(); r != nil {
+				c.mmClient.LogError("Panic in title generation", "error", r)
+			}
+		}()
 		if titleErr := c.convService.GenerateTitle(dmResult.ConversationID, bot.LLM(), post.Message, llmContext); titleErr != nil {
-			c.mmClient.LogError("Failed to generate title", "error", titleErr.Error())
+			c.mmClient.LogError("Failed to generate title", "error", titleErr)
 		}
 	}()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/handle_messages.go` around lines 372 - 378, The goroutine
launched when dmResult.IsNew that calls c.convService.GenerateTitle should
mirror handleMentionViaConversation by adding panic recovery: wrap the anonymous
function body with a defer that recovers from panics and logs them (use
c.mmClient.LogError or similar) so any panic inside GenerateTitle or its context
doesn't crash the process; keep the existing error logging for titleErr and
ensure the recover path provides context like "panic while generating title"
along with the recovered value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@conversations/handle_messages.go`:
- Around line 279-288: Wrap the anonymous goroutine calling
c.convService.GenerateTitle(...) with a defer/recover to catch and log panics so
background panics don't crash the process; inside the goroutine use defer
func(){ if r := recover(); r != nil { c.mmClient.LogError("panic generating
title", "panic", fmt.Sprintf("%v", r)) }}() before invoking
c.convService.GenerateTitle, and also make the error logging consistent by
passing the error value (e.g. "error", genErr) to c.mmClient.LogError instead of
genErr.Error(); ensure you import fmt if needed and keep references to
GenerateTitle, convService, and mmClient unchanged.
- Around line 372-378: The goroutine launched when dmResult.IsNew that calls
c.convService.GenerateTitle should mirror handleMentionViaConversation by adding
panic recovery: wrap the anonymous function body with a defer that recovers from
panics and logs them (use c.mmClient.LogError or similar) so any panic inside
GenerateTitle or its context doesn't crash the process; keep the existing error
logging for titleErr and ensure the recover path provides context like "panic
while generating title" along with the recovered value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 3b8d127d-47d1-4263-a811-71f92ba7bc09

📥 Commits

Reviewing files that changed from the base of the PR and between 5fe0d2d and bccfe75.

📒 Files selected for processing (3)
  • conversations/bot_channel_tool_filter.go
  • conversations/handle_messages.go
  • llm/tools.go
💤 Files with no reviewable changes (1)
  • llm/tools.go


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/api_test.go (1)

260-267: ⚠️ Potential issue | 🟠 Major

Configure ownership resolution path consistently in test setup.

Line 261 creates a non-nil conversationsService, and Line 266 passes it into New(...). In isConversationOwner, a non-nil conversation service is checked first; on error it returns false without falling back to conversationStore. That can invalidate the ownership path you seed later in tests.

Suggested fix
-	// Create minimal conversations service for testing
-	conversationsService := &conversations.Conversations{}
+	// Keep nil so API ownership checks use the injected conversationStore in tests.
+	var conversationsService *conversations.Conversations
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_test.go` around lines 260 - 267, The test creates a non-nil
conversationsService and passes it into New(...), which causes
isConversationOwner to use the conversationsService path and bypass the seeded
mock conversationStore; change the setup so ownership resolution uses the mock
store by passing nil for conversationsService (i.e., set conversationsService =
nil or stop passing it into New) or alternatively seed the conversationsService
with a test implementation that delegates to mockConvStore; update the variable
used in the api construction (conversationsService in the New(...) call) so it
consistently resolves ownership via mockConvStore during the test.
🧹 Nitpick comments (1)
api/api_test.go (1)

159-168: Return explicit not-found errors for missing turns in the test store.

Line 165 (GetTurnByPostID) and Line 182 (UpdateTurnContent) currently return success-like values when the turn is missing. That can mask handler error-path regressions and make tests less trustworthy.

Also applies to: 170-183

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/api_test.go` around lines 159 - 168, The mockConversationStore methods
GetTurnByPostID and UpdateTurnContent currently return nil,nil when a turn is
missing which masks not-found cases; change both methods to return a clear
not-found error (e.g., errors.New("turn not found") or a package-level sentinel
like ErrTurnNotFound) when the postID/key is absent, preserve existing behavior
of returning m.err when set, and update any tests expecting nil results to
assert the explicit not-found error instead; ensure references to
mockConversationStore, GetTurnByPostID, UpdateTurnContent, and the chosen
ErrTurnNotFound symbol are used so callers can distinguish missing-turn errors
from success.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/api_test.go`:
- Around line 121-127: The new test-defined mockConversationStore violates the
repo rule against introducing mocks in *_test.go; replace it with a
deterministic in-memory fixture type (e.g., conversationFixture or
inMemoryConversationStore) that implements the same ConversationStore methods
but without "mock" semantics: remove the generic err field and any behavior that
simulates failures, use concrete maps (conversations, turns, turnsByPost)
initialized deterministically in setup, and implement deterministic method logic
for Create/Update/Get/List so tests exercise real in-memory behavior rather than
mock failures; update tests to use the fixture type (referencing the type name
conversationFixture/inMemoryConversationStore and its methods) and ensure any
test expectations/assertions are adjusted to rely on the deterministic state
instead of injected errors.

---

Outside diff comments:
In `@api/api_test.go`:
- Around line 260-267: The test creates a non-nil conversationsService and
passes it into New(...), which causes isConversationOwner to use the
conversationsService path and bypass the seeded mock conversationStore; change
the setup so ownership resolution uses the mock store by passing nil for
conversationsService (i.e., set conversationsService = nil or stop passing it
into New) or alternatively seed the conversationsService with a test
implementation that delegates to mockConvStore; update the variable used in the
api construction (conversationsService in the New(...) call) so it consistently
resolves ownership via mockConvStore during the test.

---

Nitpick comments:
In `@api/api_test.go`:
- Around line 159-168: The mockConversationStore methods GetTurnByPostID and
UpdateTurnContent currently return nil,nil when a turn is missing which masks
not-found cases; change both methods to return a clear not-found error (e.g.,
errors.New("turn not found") or a package-level sentinel like ErrTurnNotFound)
when the postID/key is absent, preserve existing behavior of returning m.err
when set, and update any tests expecting nil results to assert the explicit
not-found error instead; ensure references to mockConversationStore,
GetTurnByPostID, UpdateTurnContent, and the chosen ErrTurnNotFound symbol are
used so callers can distinguish missing-turn errors from success.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: f619e496-0153-478d-9633-c999287c90d2

📥 Commits

Reviewing files that changed from the base of the PR and between bccfe75 and 7336ee3.

📒 Files selected for processing (7)
  • api/api.go
  • api/api_admin_test.go
  • api/api_test.go
  • server/main.go
  • store/store_test.go
  • webapp/src/client.tsx
  • webapp/src/index.tsx
✅ Files skipped from review due to trivial changes (1)
  • api/api_admin_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • store/store_test.go
  • api/api.go

Comment thread api/api_test.go
- Add title backfill from LLM_PostMeta before dropping table (migration 6)
- Clean duplicate sequences before adding UNIQUE constraint (migration 7)
- Add UserID index for conversation summaries query (migration 8)
- Add atomic CreateTurnAutoSequence with retry to prevent TOCTOU races
- Set conversation_id on posts before creation to fix frontend timing hole
- Split ProcessDMRequest so conversation is created before placeholder post
- Guard title generation to only fire on new conversations
- Remove duplicate LLM loop from HandleToolResult (visibility-only update)
- Extract shouldAutoExecuteTool to deduplicate 4 identical callbacks
- Replace correlated subquery with LEFT JOIN in summaries query
- Return ErrConversationNotFound from GetConversationByThreadAndBot
- Add frontend loading/error states for conversation fetch
Resolve conflicts:
- Delete mmtools/provider_test.go (tested SearchServer/LookupMattermostUser
  tools removed by master in #624)
- Renumber conversation-entities migrations (4-8 → 5-9) to avoid collision
  with master's migration 4 (create_custom_prompts_tables)
- Update migration count assertion in store_test.go (8 → 9)

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
streaming/streaming.go (1)

492-502: ⚠️ Potential issue | 🟠 Major

Keep messageBuilder in sync with cleanedMessage.

After the annotations event swaps in cleanedMessage, later EventTypeText chunks still append to the old builder contents. That can reintroduce citation markers into the rendered post while the persisted turn keeps the cleaned text.

Suggested fix
 						if cleanedMsg, hasCleaned := annotationMap["cleanedMessage"].(string); hasCleaned {
 							// Replace post message with cleaned version (citation markers removed)
+							messageBuilder.Reset()
+							messageBuilder.WriteString(cleanedMsg)
 							post.Message = cleanedMsg
 							p.sendPostStreamingUpdateEventWithBroadcast(post, post.Message, broadcast)
 							if acc != nil {
 								acc.text.Reset()
 								acc.text.WriteString(cleanedMsg)
 							}
 						}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@streaming/streaming.go` around lines 492 - 502, When swapping in
cleanedMessage inside the annotations handling (annotationMap -> cleanedMessage)
also update the streaming text builder so subsequent EventTypeText chunks append
to the cleaned content: set the builder used for text chunks (acc.text /
messageBuilder) to the cleanedMsg (you already reset and WriteString cleanedMsg
for acc.text) and ensure any other per-turn builder variable used elsewhere is
likewise reset/seeded with cleanedMsg; keep post.Message updated and continue
calling p.sendPostStreamingUpdateEventWithBroadcast(post, post.Message,
broadcast) after seeding the builder so future EventTypeText handling appends to
the cleaned text rather than the old contents.
♻️ Duplicate comments (4)
conversations/conversations_test.go (1)

121-174: ⚠️ Potential issue | 🟠 Major

This eval no longer exercises conversation mention handling.

Lines 121-174 bypass the conversations flow entirely: the service instance is discarded, the request is rebuilt by hand, and ChatCompletion is called directly. That means this test no longer covers the new conversation-layer assembly logic from the refactor. On top of that, Lines 165-168 still send only threadData.LatestPost().Message, so the long-thread attribution rubric is being evaluated without the earlier thread messages that mention handling depends on.

Please drive this through the new conversation service path, or at minimum rebuild posts from the full thread before appending the latest user message.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/conversations_test.go` around lines 121 - 174, The test
bypasses the conversations layer by discarding the conversations.New result and
calling llmInstance.ChatCompletion directly with posts containing only
threadData.LatestPost().Message; restore coverage by invoking the conversation
service path (use the conversations.New instance returned earlier and call its
public method that accepts the thread/request) or, if you must keep the direct
LLM call, rebuild the posts array from the full thread instead of only the
latest message (use the thread or threadData message list to append earlier
messages before LatestPost()), and remove the discarded service instance pattern
so the test exercises conversations.New and
contextBuilder.BuildLLMContextUserRequest end-to-end rather than calling
llmInstance.ChatCompletion directly.
toolrunner/toolrunner.go (2)

110-115: ⚠️ Potential issue | 🟠 Major

Apply the tool-failure guard before the first model call.

If request.Posts already ends with llm.MaxConsecutiveToolCallFailures failed tool turns, Line 114 still invokes the model with tools enabled once more. Precompute the guard before the initial ChatCompletion.

Minimal fix
 func (r *ToolRunner) Run(
 	request llm.CompletionRequest,
 	shouldExecute func(llm.ToolCall) bool,
 	onToolTurns func([]ToolTurn),
 	opts ...llm.LanguageModelOption,
 ) (*ToolRunResult, error) {
 	currentOpts := append([]llm.LanguageModelOption(nil), opts...)
+
+	if llm.CountTrailingFailedToolCalls(request.Posts) >= llm.MaxConsecutiveToolCallFailures {
+		request.Posts = llm.EnsureToolRetryLimitSystemMessage(request.Posts)
+		currentOpts = append(currentOpts, llm.WithToolsDisabled())
+	}
 
 	// Make the first LLM call synchronously so initialization errors
 	// (auth failures, rate limits, etc.) are returned directly.
 	firstStream, err := r.llm.ChatCompletion(request, currentOpts...)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@toolrunner/toolrunner.go` around lines 110 - 115, Precompute and apply the
tool-failure guard to the model options before making the first synchronous
call: check request.Posts for trailing failed tool turns against
llm.MaxConsecutiveToolCallFailures and, if the guard would disable tools, modify
currentOpts accordingly (the same way other calls do) before invoking
r.llm.ChatCompletion; ensure the logic that inspects request.Posts and toggles
tool-related options runs prior to the ChatCompletion call so the initial call
uses the guarded currentOpts.

275-277: ⚠️ Potential issue | 🟠 Major

Don’t turn MaxToolRounds exhaustion into a clean EOF.

Lines 275-277 currently emit only EventTypeEnd, so callers cannot distinguish a bailout from a legitimate empty assistant reply. Surface an error event before ending, or expose a sentinel on ToolRunResult.

One way to surface the bailout
 import (
+	"errors"
 	"encoding/json"
 	"fmt"
 	"strings"
 
 	"github.com/mattermost/mattermost-plugin-agents/llm"
 )
@@
 const MaxToolRounds = 10
+
+var ErrMaxToolRoundsExceeded = errors.New("max tool rounds exceeded")
@@
 	// Exhausted MaxToolRounds.
 	r.deliverToolTurns(result, onToolTurns)
+	output <- llm.TextStreamEvent{
+		Type:  llm.EventTypeError,
+		Value: ErrMaxToolRoundsExceeded,
+	}
 	output <- llm.TextStreamEvent{Type: llm.EventTypeEnd}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@toolrunner/toolrunner.go` around lines 275 - 277, The code currently treats
exhausting MaxToolRounds as a normal end by only sending
llm.TextStreamEvent{Type: llm.EventTypeEnd}; change this so callers can detect
the bailout by emitting an error event (e.g. llm.TextStreamEvent with Type set
to EventTypeError and a descriptive error message like "max tool rounds
exhausted") on the same output channel before sending EventTypeEnd, or
alternatively set a sentinel flag on the ToolRunResult (e.g.,
ToolRunResult.MaxRoundsExceeded) and ensure r.deliverToolTurns/onToolTurns
populates it so callers can inspect the result; update the code path that
handles MaxToolRounds exhaustion to produce one of these signals rather than
only EventTypeEnd.
store/migrations/000007_drop_post_meta_table.down.sql (1)

3-9: ⚠️ Potential issue | 🟠 Major

Backfill LLM_PostMeta during rollback.

This recreates the legacy table but leaves it empty, so a one-step rollback drops every saved title even though LLM_Conversations still has RootPostID + Title. Restore those rows before cleaning up the old FK.

Possible rollback backfill
 CREATE TABLE IF NOT EXISTS LLM_PostMeta (
     RootPostID TEXT NOT NULL PRIMARY KEY,
     Title TEXT NOT NULL
 );
 
+INSERT INTO LLM_PostMeta (RootPostID, Title)
+SELECT RootPostID, Title
+FROM LLM_Conversations
+WHERE RootPostID IS NOT NULL
+ON CONFLICT (RootPostID) DO UPDATE SET Title = EXCLUDED.Title;
+
 -- Clean up FK constraint from old LLM_Threads table if it exists
 ALTER TABLE IF EXISTS LLM_Threads DROP CONSTRAINT IF EXISTS llm_threads_rootpostid_fkey;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@store/migrations/000007_drop_post_meta_table.down.sql` around lines 3 - 9,
When recreating LLM_PostMeta in the down migration, backfill it from
LLM_Conversations before altering/dropping the FK: after CREATE TABLE IF NOT
EXISTS LLM_PostMeta (RootPostID, Title) run an INSERT INTO LLM_PostMeta
(RootPostID, Title) SELECT RootPostID, Title FROM LLM_Conversations WHERE
RootPostID IS NOT NULL AND NOT EXISTS (SELECT 1 FROM LLM_PostMeta pm WHERE
pm.RootPostID = LLM_Conversations.RootPostID) so existing rows are preserved,
then proceed to ALTER TABLE IF EXISTS LLM_Threads DROP CONSTRAINT IF EXISTS
llm_threads_rootpostid_fkey.
🧹 Nitpick comments (1)
conversations/tool_approval.go (1)

256-337: Consider extracting duplicate user-preference and tool-disabling logic.

The logic for loading user preferences, filtering disabled servers, and applying channel tool-disabling rules (lines 266-298) is nearly identical to the logic in HandleToolCall (lines 66-79). This duplication could lead to inconsistencies if one is updated without the other.

♻️ Suggested approach

Consider extracting a helper method like buildToolContextForConversation(bot, user, channel, isDM) that encapsulates:

  1. Building LLM context with default tools
  2. Loading and applying user-disabled-provider filtering
  3. Applying channel tool-disabling rules

This would ensure consistent behaviour across both HandleToolCall and streamToolFollowUp.
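
The suggested extraction might look like the following sketch. All types here (`bot`, `user`, `channel`, `llmContext`) are minimal stand-ins for the plugin's real types, and the helper name follows the review's suggestion rather than any existing API; the point is that both call sites share one code path for the disabling rules.

```go
package main

import "fmt"

// Minimal stand-ins for the plugin's bots.Bot, model.User, model.Channel,
// and llm.Context types.
type bot struct{ name string }
type user struct{ id string }
type channel struct{ id string }

type llmContext struct {
	toolsDisabled bool
}

// buildToolContextForConversation centralizes preference loading and
// tool-disabling rules so HandleToolCall and streamToolFollowUp cannot drift.
func buildToolContextForConversation(b *bot, u *user, ch *channel, isDM, channelToolsEnabled bool) *llmContext {
	ctx := &llmContext{}
	// 1. Build the LLM context with default tools (omitted here).
	// 2. Load user preferences and filter disabled servers (omitted here).
	// 3. Apply the channel tool-disabling rules in one place.
	if !isDM && !channelToolsEnabled {
		ctx.toolsDisabled = true
	}
	return ctx
}

func main() {
	// Both HandleToolCall and streamToolFollowUp would call the same helper.
	ctx := buildToolContextForConversation(&bot{}, &user{}, &channel{}, false, false)
	fmt.Println(ctx.toolsDisabled)
}
```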

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@conversations/tool_approval.go` around lines 256 - 337, Extract the
duplicated user-preference and tool-disabling logic into a helper (e.g.
buildToolContextForConversation) and replace the inlined code in
streamToolFollowUp and HandleToolCall with calls to it; the helper should accept
(bot *bots.Bot, user *model.User, channel *model.Channel, isDM bool) and return
the prepared llm.Context (or llmContext) including: building the LLM context
with c.contextBuilder.WithLLMContextDefaultTools, loading prefs via
mcp.LoadUserPreferences and calling llmContext.Tools.RemoveToolsByServerOrigin
when needed, computing toolsDisabled per channel/config
(EnableChannelMentionToolCalling, AllowNativeWebSearchInChannels) and setting
llmContext.DisabledToolsInfo when tools are disabled, and preserving any
channel-root bot-activation checks by allowing callers to apply
applyBotChannelAutoEverywhereToolFilter if required.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/api_post.go`:
- Around line 170-177: The nil/empty SiteURL guard is dead code that blocks
analysis; remove the check that aborts when
a.pluginAPI.Configuration.GetConfig().ServiceSettings.SiteURL is nil or empty
and allow the flow to create the analysis post and call a.makeAnalysisPost and
a.streamingService.StreamToNewDM even when SiteURL is unset (or alternatively
rewire the ServiceSettings.SiteURL into the post creation if you need the
value); apply the same removal/fix for the other occurrence around the
a.makeAnalysisPost call later in the file (the block currently returning
http.StatusInternalServerError).

In `@conversations/dm_conversation_test.go`:
- Around line 118-122: CreateTurnAutoSequence has a race because it calls
GetMaxSequenceForConversation (which locks internally) and then CreateTurn
separately, allowing other goroutines to change sequences between calls; fix by
performing the sequence calculation and the creation under the same
fakeConvStore mutex so they are atomic: acquire the fakeConvStore lock (the same
mutex used by GetMaxSequenceForConversation/ CreateTurn), compute max sequence
from the in-memory store and set turn.Sequence, then call CreateTurn (or insert
the turn directly) while still holding that lock to prevent interleaving
updates.

In `@conversations/regeneration.go`:
- Around line 35-46: The unconditional conversation_id ownership check prevents
the other regeneration paths from executing; change the logic so you only
perform the c.convService / conversation lookup when the conversation path is
actually intended: first detect whether the post has a non-empty
streaming.ConversationIDProp (via post.GetProp) and only then require
c.convService and call c.convService.GetConversation, otherwise skip this block
so the referenced_recording_file_id and referenced_transcript_post_id branches
can run; alternatively treat conversation ownership as required only when
neither referenced_recording_file_id nor referenced_transcript_post_id are
present. Ensure you reference and guard around c.convService and
streaming.ConversationIDProp/GetConversation accordingly.
- Around line 237-258: The code uses mmapi.IsDMWith (isDM) to decide
toolsDisabled and shared behavior which only matches 1:1 DMs and misclassifies
group DMs; change the logic to use channel.Type (check for
model.ChannelTypeDirect || model.ChannelTypeGroup) wherever isDM is used for
tool/shared decisions (replace isDM-based checks that set toolsDisabled and any
shared/unshared flags with a channel-type check), preserve the existing override
via c.configProvider.EnableChannelMentionToolCalling() but apply it only when
the channel is not direct/group, and ensure llmContext.DisabledToolsInfo
handling (llmContext.Tools.GetToolsInfo()) follows the updated toolsDisabled
value; update the identical logic at the other occurrence noted in the review as
well.
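
The channel-type check being suggested is small; a sketch under the assumption that the constants mirror Mattermost's channel type values ("D" for direct, "G" for group):

```go
package main

import "fmt"

// Stand-ins for the model.ChannelType* constants from mattermost-server.
const (
	channelTypeOpen   = "O"
	channelTypeDirect = "D"
	channelTypeGroup  = "G"
)

// isDirectOrGroup treats 1:1 DMs and group DMs alike, which is what the
// tool/shared decisions should key on instead of a 1:1-only IsDMWith check.
func isDirectOrGroup(channelType string) bool {
	return channelType == channelTypeDirect || channelType == channelTypeGroup
}

func main() {
	fmt.Println(isDirectOrGroup(channelTypeGroup)) // group DM gets DM behavior
	fmt.Println(isDirectOrGroup(channelTypeOpen))  // public channel keeps channel rules
}
```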

In `@store/conversations.go`:
- Around line 67-70: The current unique-violation handling returns
ErrConversationConflict for any 23505 error; change the logic to check the
specific constraint name (e.g., "idx_llm_conversations_thread_bot") before
mapping to ErrConversationConflict. Inspect the underlying Postgres error (use
pgerror/pgconn to read ConstraintName or extend isUniqueViolation to return the
constraint name), and only return ErrConversationConflict when that constraint
matches; otherwise propagate the original err (so a duplicate ID still returns
the original DB error).

In `@store/migrations/000007_drop_post_meta_table.up.sql`:
- Around line 4-5: The UpdatedAt assignment uses "EXTRACT(EPOCH FROM
NOW())::BIGINT * 1000" which casts seconds to BIGINT before multiplying,
dropping milliseconds; change it to multiply first and then cast, e.g.
"(EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT", updating the UpdatedAt expression
in the SET clause (the line with UpdatedAt = EXTRACT(EPOCH FROM NOW())::BIGINT *
1000).

In `@streaming/streaming.go`:
- Around line 123-133: The tool-use blocks in the StreamToPost path are being
persisted with Shared always set to true; replace the hardcoded
conversation.BoolPtr(true) in the toolCalls loop (where a.toolCalls is converted
into conversation.ContentBlock of Type BlockTypeToolUse) with the
stream/post-level shared flag (i.e., use the StreamToPost instance's shared
property or the incoming post options flag—e.g., conversation.BoolPtr(a.shared)
or the equivalent variable used to indicate whether this stream is a channel
post) and make the same change for the other occurrence noted (around the second
block creation at lines ~468-478) so tool_use blocks respect the actual
shared/unshared setting.

In `@streaming/turn_persistence_test.go`:
- Around line 132-167: The test preloads turns with Sequence values 0,1,2 which
conflicts with the model's 1-based Sequence; change the seeded sequences in
ts.turns (created in the loop) to 1,2,3 (e.g., Sequence: i+1) and update the
assertion to expect the new turn's Sequence to be 4 (require.Equal(t, 4,
ts.turns[3].Sequence)); keep references to ts.turns, the Turn.Sequence field,
and the call to service.StreamToPost to locate the change.

---

Outside diff comments:
In `@streaming/streaming.go`:
- Around line 492-502: When swapping in cleanedMessage inside the annotations
handling (annotationMap -> cleanedMessage) also update the streaming text
builder so subsequent EventTypeText chunks append to the cleaned content: set
the builder used for text chunks (acc.text / messageBuilder) to the cleanedMsg
(you already reset and WriteString cleanedMsg for acc.text) and ensure any other
per-turn builder variable used elsewhere is likewise reset/seeded with
cleanedMsg; keep post.Message updated and continue calling
p.sendPostStreamingUpdateEventWithBroadcast(post, post.Message, broadcast) after
seeding the builder so future EventTypeText handling appends to the cleaned text
rather than the old contents.

---

Duplicate comments:
In `@conversations/conversations_test.go`:
- Around line 121-174: The test bypasses the conversations layer by discarding
the conversations.New result and calling llmInstance.ChatCompletion directly
with posts containing only threadData.LatestPost().Message; restore coverage by
invoking the conversation service path (use the conversations.New instance
returned earlier and call its public method that accepts the thread/request) or,
if you must keep the direct LLM call, rebuild the posts array from the full
thread instead of only the latest message (use the thread or threadData message
list to append earlier messages before LatestPost()), and remove the discarded
service instance pattern so the test exercises conversations.New and
contextBuilder.BuildLLMContextUserRequest end-to-end rather than calling
llmInstance.ChatCompletion directly.

In `@store/migrations/000007_drop_post_meta_table.down.sql`:
- Around line 3-9: When recreating LLM_PostMeta in the down migration, backfill
it from LLM_Conversations before altering/dropping the FK: after CREATE TABLE IF
NOT EXISTS LLM_PostMeta (RootPostID, Title) run an INSERT INTO LLM_PostMeta
(RootPostID, Title) SELECT RootPostID, Title FROM LLM_Conversations WHERE
RootPostID IS NOT NULL AND NOT EXISTS (SELECT 1 FROM LLM_PostMeta pm WHERE
pm.RootPostID = LLM_Conversations.RootPostID) so existing rows are preserved,
then proceed to ALTER TABLE IF EXISTS LLM_Threads DROP CONSTRAINT IF EXISTS
llm_threads_rootpostid_fkey.

In `@toolrunner/toolrunner.go`:
- Around line 110-115: Precompute and apply the tool-failure guard to the model
options before making the first synchronous call: check request.Posts for
trailing failed tool turns against llm.MaxConsecutiveToolCallFailures and, if
the guard would disable tools, modify currentOpts accordingly (the same way
other calls do) before invoking r.llm.ChatCompletion; ensure the logic that
inspects request.Posts and toggles tool-related options runs prior to the
ChatCompletion call so the initial call uses the guarded currentOpts.
- Around line 275-277: The code currently treats exhausting MaxToolRounds as a
normal end by only sending llm.TextStreamEvent{Type: llm.EventTypeEnd}; change
this so callers can detect the bailout by emitting an error event (e.g.
llm.TextStreamEvent with Type set to EventTypeError and a descriptive error
message like "max tool rounds exhausted") on the same output channel before
sending EventTypeEnd, or alternatively set a sentinel flag on the ToolRunResult
(e.g., ToolRunResult.MaxRoundsExceeded) and ensure
r.deliverToolTurns/onToolTurns populates it so callers can inspect the result;
update the code path that handles MaxToolRounds exhaustion to produce one of
these signals rather than only EventTypeEnd.

---

Nitpick comments:
In `@conversations/tool_approval.go`:
- Around line 256-337: Extract the duplicated user-preference and tool-disabling
logic into a helper (e.g. buildToolContextForConversation) and replace the
inlined code in streamToolFollowUp and HandleToolCall with calls to it; the
helper should accept (bot *bots.Bot, user *model.User, channel *model.Channel,
isDM bool) and return the prepared llm.Context (or llmContext) including:
building the LLM context with c.contextBuilder.WithLLMContextDefaultTools,
loading prefs via mcp.LoadUserPreferences and calling
llmContext.Tools.RemoveToolsByServerOrigin when needed, computing toolsDisabled
per channel/config (EnableChannelMentionToolCalling,
AllowNativeWebSearchInChannels) and setting llmContext.DisabledToolsInfo when
tools are disabled, and preserving any channel-root bot-activation checks by
allowing callers to apply applyBotChannelAutoEverywhereToolFilter if required.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: d3845103-ba1e-4274-a44a-e14b0bd41c6a

📥 Commits

Reviewing files that changed from the base of the PR and between 7336ee3 and b187698.

📒 Files selected for processing (40)
  • api/api_no_tools_test.go
  • api/api_post.go
  • channels/analysis_conversation_test.go
  • channels/channels.go
  • channels/channels_eval_test.go
  • conversation/helpers.go
  • conversation/service.go
  • conversations/conversations.go
  • conversations/conversations_test.go
  • conversations/dm_conversation_test.go
  • conversations/handle_messages.go
  • conversations/regeneration.go
  • conversations/store.go
  • conversations/tool_approval.go
  • mcpserver/tools_eval_test.go
  • search/search_conversation_test.go
  • server/main.go
  • store/conversations.go
  • store/conversations_test.go
  • store/migrations/000005_create_conversations_table.down.sql
  • store/migrations/000005_create_conversations_table.up.sql
  • store/migrations/000006_create_turns_table.down.sql
  • store/migrations/000006_create_turns_table.up.sql
  • store/migrations/000007_drop_post_meta_table.down.sql
  • store/migrations/000007_drop_post_meta_table.up.sql
  • store/migrations/000008_add_unique_sequence_constraint.down.sql
  • store/migrations/000008_add_unique_sequence_constraint.up.sql
  • store/migrations/000009_add_userid_index.down.sql
  • store/migrations/000009_add_userid_index.up.sql
  • store/migrations/000010_fix_conversations_unique_index_soft_delete.down.sql
  • store/migrations/000010_fix_conversations_unique_index_soft_delete.up.sql
  • store/store_test.go
  • store/turns.go
  • streaming/streaming.go
  • streaming/turn_persistence_test.go
  • toolrunner/toolrunner.go
  • toolrunner/toolrunner_test.go
  • webapp/src/components/llmbot_post/llmbot_post.tsx
  • webapp/src/components/llmbot_post/turn_content_utils.ts
  • webapp/src/types/conversation.ts
✅ Files skipped from review due to trivial changes (11)
  • store/migrations/000006_create_turns_table.down.sql
  • store/migrations/000009_add_userid_index.down.sql
  • store/migrations/000005_create_conversations_table.down.sql
  • store/migrations/000010_fix_conversations_unique_index_soft_delete.down.sql
  • store/migrations/000006_create_turns_table.up.sql
  • store/migrations/000005_create_conversations_table.up.sql
  • api/api_no_tools_test.go
  • store/migrations/000009_add_userid_index.up.sql
  • search/search_conversation_test.go
  • toolrunner/toolrunner_test.go
  • server/main.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • store/store_test.go
  • channels/channels_eval_test.go
  • mcpserver/tools_eval_test.go
  • conversations/store.go

@mattermost mattermost deleted a comment from dryrunsecurity bot Apr 16, 2026
@dryrunsecurity

dryrunsecurity bot commented Apr 16, 2026

DryRun Security

This pull request includes code that builds LLM input by embedding thread posts as user-role messages without explicit boundary markers, which the scanner flags as a high-severity indirect prompt injection risk because a malicious post could inject instructions or jailbreak the model.

🟠 Indirect Prompt Injection in conversation/service.go (drs_9266b5dc)
Vulnerability Indirect Prompt Injection
Description The 'BuildChannelMentionRequest' function constructs a sequence of messages for the LLM by formatting thread posts from users. It wraps each post content with '@username: ' and places it within a message under the 'llm.PostRoleUser' role. Because this message is treated as a user input block without explicit, robust boundaries (like XML-style tags) or separation from the system/bot turn structure, a malicious user could craft a message that tricks the LLM into thinking it has received new instructions, potentially attempting to bypass safety controls or execute unauthorized tools. While tools are filtered for channel mentions, the absence of boundary indicators makes the LLM prone to 'jailbreak' style prompt injection if it processes malicious thread content.

Message: "@" + username + ": " + format.PostBody(threadPost),
})
}
}


Comment to provide feedback on these findings.

Report false positive: @dryrunsecurity fp [FINDING ID] [FEEDBACK]
Report low-impact: @dryrunsecurity nit [FINDING ID] [FEEDBACK]

Example: @dryrunsecurity fp drs_90eda195 This code is not user-facing

All finding details can be found in the DryRun Security Dashboard.

Contributor

@cursor cursor bot left a comment


Stale comment

Security Review — Conversation Entities PR

Reviewed the full diff (~20k additions, ~9k deletions) covering the new conversation entity model, API endpoints, tool approval flow, streaming turn persistence, store layer, and webapp changes.

Assessment

No high-confidence exploitable vulnerabilities found. The PR demonstrates strong security design in several areas:

Authorization model (strong):

  • GET /conversations/{id} correctly checks PermissionReadChannel for channel-linked conversations and restricts threadless conversations to the owner only.
  • Tool approval (HandleToolCall, HandleToolResult) validates conv.UserID != userID against the server-stored conversation entity, not a client-supplied value.
  • isConversationOwner() reads the conversation_id prop from a server-fetched post (via postAuthorizationRequired middleware), not from user input.
  • The postAuthorizationRequired middleware verifies channel read permission before any post-scoped handler runs.

Privacy filtering (strong):

  • Non-requester viewers get tool inputs redacted (Input set to nil) and tool results redacted (Content set to empty) via FilterForNonRequester.
  • WebSocket tool call broadcasts are scoped: full data to requesterUserID only, redacted data to other channel members via OmitUsers.
  • SanitizeForDisplay strips Unicode bidi/spoofing characters from tool-generated content.
  • SystemPrompt is stored in the DB but correctly excluded from the ConversationResponse API shape.
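
The redaction described in these bullets can be illustrated with a small sketch. The `contentBlock` struct is a trimmed stand-in for `conversation.ContentBlock`, and the function mirrors the described behavior of FilterForNonRequester rather than reproducing the plugin's implementation.

```go
package main

import "fmt"

// contentBlock is a trimmed stand-in for conversation.ContentBlock.
type contentBlock struct {
	Type    string // "text", "tool_use", "tool_result", ...
	Input   map[string]any
	Content string
	Shared  *bool
}

// filterForNonRequester clears unshared tool inputs and results for viewers
// other than the requester; other block types pass through untouched.
func filterForNonRequester(blocks []contentBlock) []contentBlock {
	out := make([]contentBlock, 0, len(blocks))
	for _, b := range blocks {
		shared := b.Shared != nil && *b.Shared
		switch b.Type {
		case "tool_use":
			if !shared {
				b.Input = nil // redact tool arguments
			}
		case "tool_result":
			if !shared {
				b.Content = "" // redact tool output
			}
		}
		out = append(out, b)
	}
	return out
}

func main() {
	f := false
	blocks := filterForNonRequester([]contentBlock{
		{Type: "tool_use", Input: map[string]any{"query": "secret"}, Shared: &f},
		{Type: "tool_result", Content: "private output", Shared: &f},
		{Type: "text", Content: "visible reply"},
	})
	fmt.Println(blocks[0].Input == nil, blocks[1].Content == "", blocks[2].Content)
}
```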

Store layer (strong):

  • All SQL uses parameterized queries via squirrel builder or explicit $N placeholders. No raw string interpolation.
  • GetConversationSummariesForUser filters by c.UserID from the authenticated session header.

Tool execution (strong):

  • ToolRunner executes tools via Context.Tools.ResolveTool, which uses the requesting user's session (set by the context builder). Tool execution stays within the requesting user's permission scope.
  • MaxToolRounds = 10 prevents infinite tool-call loops from adversarial LLM output.

Low-severity observations (not actionable vulnerabilities)

  1. handleGetAIThreads scope change: The old implementation filtered by bot DM channel IDs. The new GetConversationSummariesForUser returns all non-deleted conversations for the user, including channel-mention conversations. This is by design (the user is always the original requester), but the response now exposes ChannelID and RootPostID for conversations in channels the user may have since been removed from. These are opaque IDs (not content), and the conversation API itself enforces channel membership at access time, so content cannot leak through this path.

  2. System prompt in DB: The system prompt (which may contain bot instructions or configuration) is stored in LLM_Conversations.SystemPrompt. It is correctly excluded from the API response. However, any DB backup or direct DB access would expose these. This is a deployment-level concern, not a code vulnerability.

No findings warrant inline comments. The security posture of this PR is solid.


Sent by Cursor Automation: Find vulnerabilities

Contributor

@cursor cursor bot left a comment


Stale comment

Security Review — Conversation Entities PR

Reviewed the full diff (~21k additions, ~9.4k deletions) covering the new conversation entity model, API endpoints, tool approval flow, streaming turn persistence, ToolRunner, store layer, and webapp changes.

Assessment

No high-confidence exploitable vulnerabilities found. The PR demonstrates strong security design across the new conversation entity model:

Authorization model (strong):

  • GET /conversations/{id} correctly checks PermissionReadChannel for channel-linked conversations and restricts threadless conversations to the owner only (api/api_conversation.go lines 56-67).
  • Tool approval/rejection (HandleToolCall, HandleToolResult) validates conv.UserID != userID against the server-stored conversation entity, not client-supplied values (conversations/tool_approval.go lines 49, 209).
  • isConversationOwner() reads conversation_id from a server-fetched post (loaded via postAuthorizationRequired middleware which verifies channel membership), then checks the conversation entity's UserID against the session user. The post prop is set server-side during response creation and is not exposed to client modification through the plugin API.
  • Regeneration enforces ownership via the conversation entity with a fail-closed pattern (conversations/regeneration.go lines 35-48).

Privacy filtering (strong):

  • Non-requester viewers get tool inputs redacted (Input set to nil) and tool results redacted (Content set to empty) via FilterForNonRequester (conversation/content_block.go).
  • WebSocket tool call broadcasts are correctly scoped: full data to requesterUserID only via UserId broadcast field, redacted data to other channel members via OmitUsers (streaming/streaming.go broadcastToolCalls).
  • SanitizeForDisplay strips Unicode bidi/spoofing characters from tool-generated content before API responses.
  • SystemPrompt is stored in the DB but correctly excluded from the ConversationResponse API shape.
  • RedactUnsharedToolContent in BuildCompletionRequest prevents private tool output from leaking into channel-visible LLM replies when the follow-up streams in channels.

Store layer (strong):

  • All SQL uses parameterized queries via squirrel builder or explicit $N placeholders. No raw string interpolation.
  • GetConversationSummariesForUser filters by c.UserID from the authenticated session header.
  • CreateTurnAutoSequence correctly handles race conditions with retry logic on unique constraint violations.

Tool execution (strong):

  • The new ToolRunner executes tools via Context.Tools.ResolveTool, which operates under the requesting user's session (set by the context builder). Tool execution stays within the requesting user's permission scope.
  • MaxToolRounds = 10 prevents infinite tool-call loops from adversarial LLM output.
  • In channels, tool follow-up is gated: HandleToolCall returns without streaming for non-DM channels (lines 181-183), requiring explicit HandleToolResult approval before the LLM generates a channel-visible response.

Migration security:

  • Unique index idx_llm_conversations_thread_bot_user on (RootPostID, BotID, UserID) ensures per-user conversation isolation in shared threads, preventing cross-user conversation hijacking.

Low-severity observations (not actionable vulnerabilities)

  1. handleGetAIThreads scope expansion: The new GetConversationSummariesForUser returns all non-deleted conversations for the user (including channel-mention conversations), exposing ChannelID and RootPostID as opaque IDs. The conversation API itself enforces channel membership at content access time, so no content leaks through this path. The user is always the original requester of these conversations.

  2. System prompt in DB: The system prompt is stored in LLM_Conversations.SystemPrompt. It is correctly excluded from the API response, but any DB backup or direct DB access would expose these. This is a deployment-level concern, not a code vulnerability.

No findings warrant inline comments. The security posture of this PR is solid.


Sent by Cursor Automation: Find vulnerabilities

Contributor

@cursor cursor bot left a comment


Stale comment

Security Review — Conversation Entities PR

Scope: Reviewed all 97 changed files (21,579 additions, 9,501 deletions) covering the new conversation entity model, API endpoints, tool approval flow, streaming/WebSocket, store layer, and search/analysis paths.

Previous Findings (from earlier automation)

Two open findings from a prior review were re-validated against the current code:

  1. Race condition in GetOrCreateConversation: now addressed. conversation/service.go:220-238 handles ErrConversationConflict by retrying the lookup and appending the user turn to the concurrently-created conversation.

  2. Tool-result approvals not applied: now addressed. conversations/tool_approval.go:206-246 properly builds an acceptedSet from acceptedToolIDs, iterates all turns, and flips Shared flags on matching tool_use and tool_result blocks before streaming the follow-up.

Current Assessment

No new medium, high, or critical findings. The security model is well-implemented:

  • API authentication: Mattermost-User-Id from server-injected header; never from request body/query.
  • Conversation access control: GET /conversations/{id} checks PermissionReadChannel for channel conversations; owner-only for threadless/DM conversations. SystemPrompt correctly excluded from the API response type.
  • Privacy filtering: Non-owners receive FilterForNonRequester redaction (tool inputs/results cleared unless shared=true), plus SanitizeForDisplay for bidi/spoofing defense.
  • Tool approval ownership: Both HandleToolCall and HandleToolResult enforce conv.UserID == userID.
  • WebSocket scoping: broadcastToolCalls sends full tool data only to the requester (UserId: requesterUserID); all other channel members receive redactToolCalls (names and status only, no arguments/results).
  • Channel tool privacy: Tool blocks default to shared=false in channels; DM tools default to shared=true. The RedactUnsharedToolContent option in BuildCompletionRequest prevents unshared tool content from reaching the LLM in channel follow-up requests.
  • Regeneration ownership: HandleRegenerate verifies conv.UserID != userID before proceeding.
  • SQL parameterization: All store queries use Squirrel query builder or explicit $N parameters — no string interpolation.
  • Search scoping: executeSearch passes UserID for embedding-level permission scoping.
  • Listing isolation: GetConversationSummariesForUser filters by UserID from the session.

Sent by Cursor Automation: Find vulnerabilities

Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security Review — No Findings

Reviewed the full diff (97 files, ~21.8k additions / ~9.5k deletions) covering:

  • New GET /conversations/{id} API endpoint and authorization model
  • Conversation entity creation, ownership, and turn storage
  • Tool approval/rejection flow (HandleToolCall, HandleToolResult)
  • Privacy filtering (FilterForNonRequester, SanitizeForDisplay)
  • Streaming per-user broadcast for tool call redaction
  • Store layer SQL queries (parameterized via Squirrel / bound placeholders)
  • Search scoping via pgvector ChannelMembers JOIN
  • ToolRunner auto-execution and bot_channel_tool_filter
  • Regeneration ownership checks (fail-closed)

No medium, high, or critical vulnerabilities found. Key security properties verified:

  • Authentication: Mattermost-User-Id from server-injected header throughout; no body/query-param identity sources.
  • Conversation API authz: channel-scoped via HasPermissionToChannel; threadless conversations are owner-only; non-owners get tool redaction.
  • Tool approval authz: double-layered, API isConversationOwner plus service conv.UserID == userID; the pending turn must match the clicked post.
  • Privacy filtering: default shared=false for channel tool blocks; FilterForNonRequester redacts unshared tool I/O; other block types pass through (by design, they are channel-visible bot reply content).
  • Streaming: broadcastToolCalls splits full data (requester only) vs. redacted (rest of channel).
  • Search scoping: pgvector requires a non-empty UserID; the ChannelMembers JOIN enforces membership.
  • SQL injection: Squirrel builder plus bound placeholders; raw SQL uses positional params.
  • Auto-execution: channel follow-up applies toolsDisabled + WithToolsDisabled() when the config is off; the bot_activate_ai path is restricted to auto_run_everywhere tools via applyBotChannelAutoEverywhereToolFilter.
  • Regeneration: fails closed when convService is nil or conversation_id is missing.

Sent by Cursor Automation: Find vulnerabilities
