[SSP] Integrate alt selected samples with similarity search panel#7307
…loading similarity panel
Walkthrough
Changes add explicit support for negative (alternate) sample selections across the similarity-search stack. Selection state now distinguishes default (positive) vs alt (negative) samples and exposes
Sequence Diagram(s)
sequenceDiagram
participant User as User
participant UI as NewSearch UI
participant Selection as useSearchSelection
participant Form as useNewSearchForm
participant Submission as useSearchSubmission
participant Backend as operators.py
User->>UI: select samples (type: default or alt)
UI->>Selection: update selectedSamples
Selection->>Selection: derive queryIds (type=="default")
Selection->>Selection: derive negativeQueryIds (type=="alt")
Selection-->>Form: expose queryIds, negativeQueryIds
User->>UI: choose query type (text/image)
UI-->>Form: set form.queryType
User->>UI: submit search
Form->>Submission: submit with queryIds & negativeQueryIds
Submission->>Backend: buildExecutionParams(queryIds, negativeQueryIds)
Backend->>Backend: compute pos_mean, neg_mean
alt negatives exist
Backend->>Backend: combined = 2*pos_mean - neg_mean
else no negatives
Backend->>Backend: combined = pos_mean
end
Backend->>Backend: combined = combined / ||combined|| (L2 normalize if norm>0)
Backend-->>Submission: return normalized vector
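The backend steps in the diagram can be sketched in NumPy as follows. This is a minimal illustration and not the code in `operators.py`: the function name `combine_query` and its list-of-vectors inputs are assumptions made for the example. Only the formula (`2 * pos_mean - neg_mean`, followed by L2 normalization when the norm is positive) is taken from the walkthrough.

```python
import numpy as np


def combine_query(pos_embeddings, neg_embeddings=None):
    """Hypothetical helper: merge positive/negative embeddings into one query."""
    pos_mean = np.mean(np.asarray(pos_embeddings, dtype=float), axis=0)

    if neg_embeddings is not None and len(neg_embeddings) > 0:
        neg_mean = np.mean(np.asarray(neg_embeddings, dtype=float), axis=0)
        # Push the query away from the negatives: pos + (pos - neg)
        combined = 2 * pos_mean - neg_mean
    else:
        combined = pos_mean

    # L2-normalize, leaving an all-zero vector untouched
    norm = np.linalg.norm(combined)
    if norm > 0:
        combined = combined / norm
    return combined


query = combine_query([[1.0, 0.0], [1.0, 2.0]], [[0.0, 1.0]])
print(query, np.linalg.norm(query))
```

Normalizing at the end means the result has unit length whenever it is non-zero, which keeps cosine-style and dot-product-style backends in agreement about ranking.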
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/packages/similarity-search/src/hooks/useNewSearchForm.ts`:
- Around line 27-37: In useNewSearchForm remove the redundant conditional
branches in the useMemo computations for defaultBrainKey and defaultQueryType:
both branches that check hasSamplesSelected return the same value, so simplify
defaultBrainKey to derive its value only from cloneConfig?.brain_key and
brainKeys[0]?.key (use those in the dependency array) and simplify
defaultQueryType to return cloneConfig?.query_type ?? "image". Update the
useMemo declarations for defaultBrainKey and defaultQueryType accordingly and
keep cloneConfig, brainKeys, and hasSamplesSelected in dependencies only as
necessary.
In `@app/packages/similarity-search/src/hooks/useSearchSelection.ts`:
- Around line 34-41: Add a short clarifying comment above the negativeQueryIds
useMemo explaining that it intentionally only derives IDs from selectedSamples
with type === "alt" because selectedLabels does not carry type information
(default vs alt) and therefore cannot be used for negative selection; reference
the negativeQueryIds variable and selectedSamples/selectedLabels so reviewers
can find the logic in useSearchSelection.ts.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5c0a7554-34e6-46ee-97c4-795f39be85b3
📒 Files selected for processing (7)
- app/packages/similarity-search/src/components/Home/RunActions.tsx
- app/packages/similarity-search/src/components/NewSearch/NewSearch.tsx
- app/packages/similarity-search/src/hooks/useNewSearchForm.ts
- app/packages/similarity-search/src/hooks/useSearchSelection.ts
- app/packages/similarity-search/src/hooks/useSearchSubmission.ts
- plugins/panels/similarity_search/__init__.py
- plugins/panels/similarity_search/operators.py
imanjra
left a comment
Left minor suggestions but LGTM
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@plugins/panels/similarity_search/operators.py`:
- Around line 218-219: Update the inline documentation to match the weighted
Qdrant-style formula used in the implementation: find the comment block
containing "Qdrant-style: query = avg(pos) + (avg(pos) - avg(neg))" and replace
any occurrences elsewhere in this file that describe the simpler "avg(pos) -
avg(neg)" with the correct weighted expression "2 * avg(pos) - avg(neg)"; also
update any related docstrings or explanatory text in functions/methods that
reference the old formula (search for strings like "avg(pos) - avg(neg)" or
"Qdrant-style") so docs and code are consistent.
- Around line 220-224: The code silently ignores negative intent when
negative_ids were supplied but no neg_embeddings were found; change the logic
that sets combined (currently using neg_embeddings, pos_mean, combined) to first
check whether negative_ids was provided and, if so and neg_embeddings is empty,
surface an error (e.g., raise ValueError or return an explicit error) instead of
falling back to pos_mean; only fall back to pos_mean when negative_ids was not
supplied/empty. Update the block that computes neg_mean and combined to detect
the mismatch between negative_ids and neg_embeddings and raise/log accordingly
so callers know negative intent was lost.
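To make the first finding concrete, the two ways of writing the Qdrant-style query are algebraically identical, and both differ from the plain difference. The symbols $\bar{p}$ and $\bar{n}$ (the means of the positive and negative embeddings) and the sample values are notation and numbers chosen for this note, not taken from the code.

```latex
\[
q = \bar{p} + (\bar{p} - \bar{n}) = 2\bar{p} - \bar{n}
\]
% Worked example with \bar{p} = (1, 1) and \bar{n} = (0, 1):
\[
2\bar{p} - \bar{n} = (2, 1), \qquad \bar{p} - \bar{n} = (1, 0)
\]
```

The weighted form stays anchored on the positives, while the plain difference keeps only the direction in which positives and negatives disagree.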
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7fc6d58f-16d1-40fc-9c03-4d1ae6982828
📒 Files selected for processing (1)
plugins/panels/similarity_search/operators.py
     if neg_embeddings:
         neg_mean = np.mean(neg_embeddings, axis=0)
-        return pos_mean - neg_mean
+        combined = 2 * pos_mean - neg_mean
     else:
-        return pos_mean
+        combined = pos_mean
Avoid silently discarding negative intent when no negative embeddings resolve.
When negative_ids is provided but all IDs are missing, Line 224 falls back to pos_mean, effectively converting a pos+neg search into pos-only without signaling the caller.
Proposed fix
-    neg_embeddings = []
-    if negative_ids:
+    neg_embeddings = []
+    has_negative_ids = bool(negative_ids)
+    if has_negative_ids:
         embeddings, _, _ = results.get_embeddings(
             **{id_key: negative_ids}, allow_missing=True
         )
         neg_embeddings = [np.asarray(e) for e in embeddings]
@@
-    if neg_embeddings:
+    if has_negative_ids and not neg_embeddings:
+        raise ValueError("No embeddings found for negative samples")
+
+    if neg_embeddings:
         neg_mean = np.mean(neg_embeddings, axis=0)
         combined = 2 * pos_mean - neg_mean
     else:
         combined = pos_mean
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@plugins/panels/similarity_search/operators.py` around lines 220 - 224, The
code silently ignores negative intent when negative_ids were supplied but no
neg_embeddings were found; change the logic that sets combined (currently using
neg_embeddings, pos_mean, combined) to first check whether negative_ids was
provided and, if so and neg_embeddings is empty, surface an error (e.g., raise
ValueError or return an explicit error) instead of falling back to pos_mean;
only fall back to pos_mean when negative_ids was not supplied/empty. Update the
block that computes neg_mean and combined to detect the mismatch between
negative_ids and neg_embeddings and raise/log accordingly so callers know
negative intent was lost.
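The distinction this comment draws, between "no negatives requested" and "negatives requested but none resolved", can be shown in isolation. The sketch below is an assumption-laden stand-in: `resolve_negatives`, its `strict` flag, and the plain dict `index` are hypothetical and only imitate a lookup that skips missing IDs (as `allow_missing=True` is described doing in this thread). It is not the FiftyOne API.

```python
import logging

logger = logging.getLogger(__name__)


def resolve_negatives(negative_ids, lookup, strict=True):
    """Hypothetical helper: fetch embeddings for negative IDs, skipping missing ones."""
    negative_ids = list(negative_ids or [])
    found = [lookup[i] for i in negative_ids if i in lookup]

    # Negatives were requested but none resolved: do not drop the intent silently
    if negative_ids and not found:
        message = "No embeddings found for negative samples"
        if strict:
            raise ValueError(message)
        logger.warning("%s; falling back to positive-only query", message)
    return found


index = {"a": [1.0, 0.0]}  # stand-in for a similarity index
print(resolve_negatives([], index))  # no negatives requested
print(resolve_negatives(["a", "zzz"], index))  # partial match is accepted
```

Whether to raise or only warn is a product decision. The `strict` flag is there only to show both options side by side.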
When only negative_ids are selected for image search, we disable form submission in the FE.
@lanzhenw, thanks for the clarification — that guards against the case where only negatives are submitted. However, my concern is a slightly different edge case: the user selects both positive and negative samples, but all the negative sample IDs happen to be absent from the similarity index (e.g., embeddings were never computed for them). With allow_missing=True, neg_embeddings ends up empty and the code silently falls back to pos_mean, dropping the negative intent entirely without any warning.
If that scenario is considered acceptable (e.g., the index is assumed to be always complete), feel free to dismiss. Otherwise, a targeted guard like this would make the behavior explicit:
if has_negative_ids and not neg_embeddings:
    logger.warning(
        "negative_ids were provided but no embeddings were found; "
        "falling back to positive-only query"
    )
    # or raise ValueError("No embeddings found for negative samples")
🧠 Learnings used
Learnt from: lanzhenw
Repo: voxel51/fiftyone PR: 7190
File: app/packages/similarity-search/src/hooks/useSimilarityPanel.ts:125-140
Timestamp: 2026-03-16T16:12:17.566Z
Learning: In `app/packages/similarity-search/src/hooks/useSimilarityPanel.ts`, `handleClone` intentionally discards `run.query` when it is not a string (i.e., image query IDs). Image query IDs are ephemeral and tied to the user's sample selection at clone time, so cloning an image run carries over only the search settings (brain_key, query_type, k, reverse, dist_field) and asks the user to re-select samples. This is intentional UX, not a bug.
Learnt from: swheaton
Repo: voxel51/fiftyone PR: 6646
File: tests/unittests/storage_tests.py:84-84
Timestamp: 2025-12-09T15:35:45.409Z
Learning: In the voxel51/fiftyone repository, when reviewing Python code that uses open(), do not suggest removing explicit mode="r" arguments even if Ruff flags UP015. The maintainers prefer explicit open mode for readability. Apply this guideline to all Python files in the repo (tests and source) where open() is used, not just a single file.
Learnt from: AdonaiVera
Repo: voxel51/fiftyone PR: 7093
File: docs/scripts/generate_plugin_docs.py:828-839
Timestamp: 2026-02-26T00:52:03.144Z
Learning: In Python code, avoid recompiling the same regex on every use. If a class uses multiple regex patterns, compile them once (e.g., in __init__ or a dedicated _compile_regex_patterns method) and reuse the compiled objects in all methods. This improves performance and consistency across the class. Apply this pattern to similarly structured classes across the repository.
Learnt from: Burhan-Q
Repo: voxel51/fiftyone PR: 7274
File: fiftyone/core/odm/ontology.py:60-64
Timestamp: 2026-04-01T13:33:16.011Z
Learning: When requesting the current UTC time in this repository, prefer `datetime.now(timezone.utc)` over `datetime.utcnow()`. `datetime.utcnow()` returns a naive `datetime` (no tzinfo) and is deprecated in Python 3.12, so it’s not equivalent to timezone-aware UTC obtained via `datetime.now(timezone.utc)`. During review, don’t suggest reverting to `datetime.utcnow()`; use `datetime.now(timezone.utc)` to ensure tz-aware UTC values.
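The last learning above can be illustrated with the standard library alone. This is a small sketch of the preferred call. The variable name `aware_now` is chosen for the example.

```python
from datetime import datetime, timezone

# Timezone-aware UTC timestamp (preferred over the naive datetime.utcnow())
aware_now = datetime.now(timezone.utc)

print(aware_now.tzinfo)
print(aware_now.isoformat())
```

Because the value carries `tzinfo`, it can be compared with other aware datetimes and serializes with an explicit UTC offset.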
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/packages/similarity-search/src/components/NewSearch/NewSearch.tsx (1)
199-203: 🧹 Nitpick | 🔵 Trivial
Consider extracting the middle dot unicode to a constant.
The `\u00B7` separator works, but extracting it to a named constant (e.g., `SEPARATOR_DOT`) would improve readability, or use a simpler `|` character.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/packages/similarity-search/src/components/NewSearch/NewSearch.tsx` around lines 199 - 203, Inline use of the middle-dot unicode (\u00B7) in the JSX reduces readability; define a named constant (e.g., SEPARATOR_DOT = '·' or use '|' if preferred) at the top of the NewSearch component file and replace the inline `\u00B7` usage in the template string (where form.negativeQueryIds is referenced) with that constant (keep the surrounding spacing and interpolation logic unchanged) so the separator is clear and reusable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/packages/similarity-search/src/hooks/useComputeSimilarity.ts`:
- Around line 14-20: The prompt callback calls triggerEvent with computeSimUri
which can be undefined when useFirstExistingUri finds no operator; update the
prompt function (and/or its caller) to guard against undefined computeSimUri by
returning early or no-op when isAvailable/computeSimUri is falsy — i.e., check
computeSimUri (and optionally isAvailable) inside prompt before calling
triggerEvent with panelId and params so triggerEvent never receives an undefined
operator.
---
Duplicate comments:
In `@app/packages/similarity-search/src/components/NewSearch/NewSearch.tsx`:
- Around line 199-203: Inline use of the middle-dot unicode (\u00B7) in the JSX
reduces readability; define a named constant (e.g., SEPARATOR_DOT = '·' or use
'|' if preferred) at the top of the NewSearch component file and replace the
inline `\u00B7` usage in the template string (where form.negativeQueryIds is
referenced) with that constant (keep the surrounding spacing and interpolation
logic unchanged) so the separator is clear and reusable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e34db17f-9367-480e-a8a8-39ece0a5bf3b
⛔ Files ignored due to path filters (2)
- app/packages/core/package.json is excluded by !**/*.json
- app/yarn.lock is excluded by !**/yarn.lock, !**/*.lock, !**/*.lock
📒 Files selected for processing (10)
- app/packages/similarity-search/src/components/Home/RunActions.tsx
- app/packages/similarity-search/src/components/NewSearch/NewSearch.tsx
- app/packages/similarity-search/src/components/SimilarityIndex/SimilarityIndex.tsx
- app/packages/similarity-search/src/components/SimilaritySearchCTA.tsx
- app/packages/similarity-search/src/constants.ts
- app/packages/similarity-search/src/hooks/useComputeSimilarity.ts
- app/packages/similarity-search/src/hooks/useNewSearchForm.ts
- app/packages/similarity-search/src/mui.ts
- app/packages/similarity-search/src/utils.ts
- plugins/panels/similarity_search/operators.py
🔗 Related Issues
📋 What changes are proposed in this pull request?
Integrates the newly merged alt-selected-samples feature (PR #7109) with the similarity search panel
- `useSearchSelection` splits the `selectedSamples` Map by type (`"default"` → positive `queryIds`, `"alt"` → negative `negativeQueryIds`) and wires through `useNewSearchForm` → `useSearchSubmission` → `buildExecutionParams`
- `_compute_combined_query` now L2-normalizes the combined vector (mean(positive) - mean(negative)) for backend-agnostic correctness
🧪 How is this patch tested? If it is not, please explain why.
Manual testing:
`__tests__/utils.test.ts` covers `buildExecutionParams` with `negativeQueryIds`
📝 Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
What areas of FiftyOne does this PR affect?
`fiftyone` Python library changes
Summary by CodeRabbit
New Features
Bug Fixes
UI