Fix ObjectBox hardcoded 768 dimensions#46
Conversation
- Updated `VectorIndex` interface to include `dimension` property. - Refactored `ObjectBoxVectorIndex` to support 384 and 768 dimensions using a factory pattern and dimension-specific entities (`ObxVectorDoc` and `ObxVectorDoc384`). - Updated `MemoryGraph` to automatically use the correct dimension from the `EmbeddingsAdapter`. - Regenerated ObjectBox code to include the new entity. - Added comprehensive tests for multiple vector dimensions.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
📝 WalkthroughWalkthroughThe PR adds configurable vector dimensionality support to ObjectBoxVectorIndex, addressing hardcoded 768-dimension limitations. It introduces a new Changes
Sequence Diagram(s)sequenceDiagram
participant MemoryGraph
participant EmbeddingsAdapter
participant ObjectBoxVectorIndex
participant VectorIndexImpl as Dimension-Specific Implementation
participant ObjectBoxStore
MemoryGraph->>EmbeddingsAdapter: query dimension property
EmbeddingsAdapter-->>MemoryGraph: return dimension (384 or 768)
MemoryGraph->>ObjectBoxVectorIndex: factory call with dimension
ObjectBoxVectorIndex->>ObjectBoxVectorIndex: dispatch to correct implementation
alt dimension == 768
ObjectBoxVectorIndex->>VectorIndexImpl: create _ObjectBoxVectorIndex768
else dimension == 384
ObjectBoxVectorIndex->>VectorIndexImpl: create _ObjectBoxVectorIndex384
else unsupported
ObjectBoxVectorIndex-->>ObjectBoxVectorIndex: throw ArgumentError
end
VectorIndexImpl->>ObjectBoxStore: initialize with dimension-specific entity
VectorIndexImpl-->>MemoryGraph: return configured index
MemoryGraph->>VectorIndexImpl: semantic search with embeddings
VectorIndexImpl->>VectorIndexImpl: validate embedding length vs dimension
VectorIndexImpl->>ObjectBoxStore: query with dimension-specific properties
ObjectBoxStore-->>VectorIndexImpl: return results
VectorIndexImpl-->>MemoryGraph: return search results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for multiple vector dimensions (specifically 384 and 768) within the ObjectBox vector index implementation. It refactors the ObjectBoxVectorIndex into a factory-based architecture with a base class and specific entity implementations for different dimensions, ensuring type safety and correct HNSW configuration. The changes also include updates to the MemoryGraph to automatically select the appropriate index based on the embedding adapter's dimensions and a new test suite to verify compatibility. Review feedback highlights opportunities to improve performance by removing redundant toList() calls on Float32List during document insertion and search operations.
| vector: vec.toList(growable: false), | ||
| ); | ||
| _box.query(_docKeyProperty.equals(key)).build().findFirst(); | ||
| final entity = _createEntity(key, content, vec.toList(growable: false)); |
There was a problem hiding this comment.
Calling toList() on a Float32List creates a new List<double> and copies all elements. Since Float32List already implements List<double>, this conversion is unnecessary and impacts performance during document insertion.
| final entity = _createEntity(key, content, vec.toList(growable: false)); | |
| final entity = _createEntity(key, content, vec); |
| .query(ObxVectorDoc_.vector | ||
| .nearestNeighborsF32(q.toList(growable: false), topK)) | ||
| .query(_vectorProperty.nearestNeighborsF32( | ||
| q.toList(growable: false), topK)) |
There was a problem hiding this comment.
The ObjectBox vector index previously had a hardcoded dimension of 768, which caused crashes when using on-device models like MiniLM-L6-v2 (384 dimensions). This PR makes the dimension configurable.
Key changes:
VectorIndex: Addedint get dimension.ObjectBoxVectorIndex: Now an abstract class with a factory. It usesObxVectorDocfor 768-dim andObxVectorDoc384for 384-dim.MemoryGraph: Passes the adapter's dimension to the vector index during initialization.test/vector_dimension_test.dartto verify the fix.Fixes #41
PR created automatically by Jules for task 16529490341314575802 started by @iberi22
Summary by CodeRabbit
Release Notes
New Features
Tests
Chores