Searches all open conversations+branches for the text #61

Open
vale-sqr wants to merge 5 commits into anima-research:main from vale-sqr:search/searches-conversations

@vale-sqr
Contributor

Added a search feature. It searches through all open conversations and opens the correct branch in the matching conversation.

@antra-tess
Contributor

Review: Search Implementation

Thanks for adding search - this is a valuable feature! However, the current implementation has a scaling issue that needs to be addressed before merge.

Problem

The current approach loads all conversations and all messages into memory on every search:

const conversations = await db.getUserConversations(req.userId);
for (const conversation of conversations) {
  const messages = await db.getConversationMessages(conversation.id, req.userId);
  // ...
}

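This won't scale with our conversation volume: every search reads every message of every conversation, so the cost grows with the total stored history rather than with the number of matches.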


Proposed Solution: Per-Conversation Search Indexes

Instead of scanning content on every query, maintain a lightweight search index per conversation:

Structure

data/conversations/
  {convId}.jsonl           # existing: event log
  {convId}.idx.json        # new: search index

Index format

{
  "v": 1,
  "entries": [
    {
      "m": "messageId",
      "b": "branchId",
      "r": "assistant",
      "t": "lowercased searchable text...",
      "ts": 1706500000000
    }
  ]
}
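
To make the format concrete, here is a minimal sketch of types for the index and a function that builds it from a conversation's messages. The `Message` shape and the `buildIndex` name are assumptions for illustration; the real message model in the codebase may differ, for example in how branches are stored.

```typescript
// Hypothetical message shape; adapt to the real model.
interface Message {
  id: string;
  branchId: string;
  role: "user" | "assistant" | "system";
  content: string;
  createdAt: number; // epoch milliseconds
}

// Mirrors the proposed on-disk format, with short keys to keep files small.
interface IndexEntry {
  m: string;  // message id
  b: string;  // branch id
  r: string;  // role
  t: string;  // lowercased searchable text
  ts: number; // timestamp, epoch milliseconds
}

interface SearchIndex {
  v: 1;
  entries: IndexEntry[];
}

// Builds a version 1 index with one entry per message.
function buildIndex(messages: Message[]): SearchIndex {
  return {
    v: 1,
    entries: messages.map((msg) => ({
      m: msg.id,
      b: msg.branchId,
      r: msg.role,
      t: msg.content.toLowerCase(),
      ts: msg.createdAt,
    })),
  };
}
```

Lowercasing once at index time means a query only needs to be lowercased once per search, which is what the search flow below relies on.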

Lifecycle

| Event | Action |
| --- | --- |
| Conversation loaded | Build index from messages, keep in memory |
| Message added/edited | Update in-memory index, mark dirty |
| Dirty + 5s idle (debounced) | Flush to disk |
| Server shutdown | Flush all dirty indexes |
| Search (conv not loaded) | Read .idx.json from disk (don't load full conversation) |
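
The dirty/flush part of the lifecycle could look roughly like the sketch below. `IndexStore`, its method names, and the write-to-temp-then-rename step are assumptions for illustration, not part of the PR; only the 5s idle debounce and the flush on shutdown come from the table above.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

type SearchIndex = { v: number; entries: unknown[] };

class IndexStore {
  private dir: string;
  private idleMs: number;
  private indexes = new Map<string, SearchIndex>();
  // A pending timer doubles as the "dirty" flag for a conversation.
  private timers = new Map<string, ReturnType<typeof setTimeout>>();

  constructor(dir: string, idleMs = 5000) {
    this.dir = dir;
    this.idleMs = idleMs;
  }

  // Call after a message is added or edited.
  update(convId: string, index: SearchIndex): void {
    this.indexes.set(convId, index);
    // Restart the idle timer so the flush runs only after idleMs without updates.
    const pending = this.timers.get(convId);
    if (pending) clearTimeout(pending);
    const timer = setTimeout(() => {
      this.flush(convId).catch((err) => console.error("index flush failed", err));
    }, this.idleMs);
    this.timers.set(convId, timer);
  }

  // Writes one index to disk and clears its dirty state.
  async flush(convId: string): Promise<void> {
    const pending = this.timers.get(convId);
    if (pending) clearTimeout(pending);
    this.timers.delete(convId);
    const index = this.indexes.get(convId);
    if (!index) return;
    const file = path.join(this.dir, `${convId}.idx.json`);
    // Write to a temp file, then rename, so readers never see a partial index.
    const tmp = `${file}.tmp`;
    await fs.writeFile(tmp, JSON.stringify(index));
    await fs.rename(tmp, file);
  }

  // Call on server shutdown: flushes every index that is still dirty.
  async flushAll(): Promise<void> {
    await Promise.all([...this.timers.keys()].map((id) => this.flush(id)));
  }
}
```

A shutdown hook would call `flushAll()` before exiting. This sketch does not track flushes that are already in flight when shutdown begins, which a real implementation would probably want to await as well.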

Search flow

async searchAllConversations(userId: string, query: string, limit: number) {
  const conversations = await db.getUserConversations(userId);
  const results = [];
  const q = query.toLowerCase(); // index text is already lowercased

  for (const conv of conversations) {
    // Use in-memory index if loaded, otherwise read from disk
    const index = this.loadedIndexes.get(conv.id)
      ?? await this.readIndexFromDisk(conv.id);

    for (const entry of index.entries) {
      if (entry.t.includes(q)) {
        results.push({ conversationId: conv.id, ...entry });
        if (results.length >= limit) break;
      }
    }
    // Early exit: keeps the first `limit` matches in conversation order,
    // which are not necessarily the newest matches overall
    if (results.length >= limit) break;
  }

  // Newest first among the collected matches
  return results.sort((a, b) => b.ts - a.ts).slice(0, limit);
}
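
The flow above leaves `readIndexFromDisk` undefined. A minimal sketch of it follows, together with a variant of the matching loop that scans every index before sorting. The signatures, the `dir` parameter, the null-on-failure behavior, and the `newestMatches` name are all assumptions for illustration.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

interface IndexEntry { m: string; b: string; r: string; t: string; ts: number }
interface SearchIndex { v: number; entries: IndexEntry[] }
type SearchHit = IndexEntry & { conversationId: string };

// Returns null when the file is missing, unreadable, or not a version 1 index,
// so the caller can skip the conversation or rebuild its index.
async function readIndexFromDisk(dir: string, convId: string): Promise<SearchIndex | null> {
  try {
    const raw = await fs.readFile(path.join(dir, `${convId}.idx.json`), "utf8");
    const parsed = JSON.parse(raw) as SearchIndex;
    return parsed.v === 1 && Array.isArray(parsed.entries) ? parsed : null;
  } catch {
    return null;
  }
}

// Collects every match first, then sorts, so the newest matches win
// regardless of the order in which conversations are visited.
function newestMatches(
  indexes: Map<string, SearchIndex>,
  query: string,
  limit: number,
): SearchHit[] {
  const q = query.toLowerCase();
  const hits: SearchHit[] = [];
  for (const [conversationId, index] of indexes) {
    for (const entry of index.entries) {
      if (entry.t.includes(q)) hits.push({ conversationId, ...entry });
    }
  }
  return hits.sort((a, b) => b.ts - a.ts).slice(0, limit);
}
```

The trade-off is that `newestMatches` gives up the early exit: it always scans every index, while the flow above stops as soon as `limit` matches are collected. Which behavior is preferable depends on whether the conversation list is already ordered by recency.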

Why this approach?

  • No new dependencies - just JSON files
  • Fits existing architecture - mirrors the JSONL pattern
  • Lazy loading - only reads index files, not full conversations
  • Fast for loaded conversations - in-memory search
  • Survives restarts - indexes persisted to disk

Also

Minor issues with the current PR (can be fixed alongside the above):

  1. Remove content from response - only snippet is used in the UI
  2. Hardcoded dark-mode colors - use theme variables instead of rgba(255, 255, 255, ...)
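
For the first point, a snippet can be cut on the server so the full message content never needs to be in the response. A minimal sketch, assuming a hypothetical `makeSnippet` helper and a fixed context radius (neither is taken from the PR):

```typescript
// Returns a short excerpt around the first case-insensitive match,
// or null when the query does not occur in the content.
function makeSnippet(content: string, query: string, radius = 40): string | null {
  // Caveat: for a few Unicode characters lowercasing changes string length,
  // which would shift the offsets slightly. Ignored in this sketch.
  const pos = content.toLowerCase().indexOf(query.toLowerCase());
  if (pos === -1) return null;

  const start = Math.max(0, pos - radius);
  const end = Math.min(content.length, pos + query.length + radius);

  // Mark truncation on either side.
  const prefix = start > 0 ? "..." : "";
  const suffix = end < content.length ? "..." : "";
  return prefix + content.slice(start, end) + suffix;
}
```

The search response can then carry only identifiers and the snippet, and the client fetches the full conversation when a result is selected.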

Let me know if you have questions about this approach!

Contributor

@antra-tess left a comment

Needs indexed search approach for scaling - see detailed comment.

@vale-sqr
Contributor Author

vale-sqr commented Feb 4, 2026

Added indexing. Searches now compare the query against the index, and the full conversation is loaded only when it is selected from the search bar.
