feat(kb): complete phase 1A storage decoupling and tests#158
Conversation
sqhyz55
commented
Mar 16, 2026
- introduce MetadataStore/VectorIndexStore/KBWriteCoordinator contracts and LanceDB-backed defaults
- refactor RAG management and API paths to use storage abstractions instead of direct LanceDB access
- add storage factory singleton reset hook and update tests to depend on storage layer (including collection manager and collections tests)
- ensure phase 1A keeps doc_id semantics while remaining compatible with future file_id linkage
17767c1 to
f1cc176
Compare
e4ae0a0 to
2926c29
Compare
|
[ThumbsUp] |
2926c29 to
c6f5485
Compare
|
It looks like the main functionality is ready (c6f5485). Are you planning to take this out of draft to start the PR checks? |
|
A lot of conflicts to resolve. Let's focus to merge this within this week. |
OK |
rogercloud
left a comment
There was a problem hiding this comment.
Code Review: Phase 1A Storage Decoupling
Thanks for this PR -- the storage abstraction direction is sound and test coverage is improved.
I am requesting changes for 3 critical and 6 major issues:
Critical:
aggregate_collection_statsloads entire tables into Python memory (OOM risk)- Inconsistent
modelfield between write/read in ingestion (duplicate embeddings risk) list_collectionssilently drops configs whenuser_id=None
Major:
4. asyncio.run() per-collection in loop
5. 12x duplicate get_connection_from_env()
6. Auto-migration during read operations
7. Inconsistent search fallback errors
8. Hub resolution silently swallows errors
9. Tag collision risk in hub resolution
Minor (6): Arbitrary *100 multiplier, KBWriteCoordinator shell, deployment-specific script, test mock level, get_raw_connection leaks, redundant fixtures
d910b5b to
39755dd
Compare
rogercloud
left a comment
There was a problem hiding this comment.
New Review Findings (beyond existing rogercloud comments)
5 new findings identified while re-verifying all existing comments against the checked-out code:
| # | Severity | File | Summary |
|---|---|---|---|
| 1 | Major | search_sparse.py |
search_sparse_async incorrectly assumes FTS is enabled after create_index |
| 2 | Minor | search_engine.py |
create_index return parsed via fragile "advice:" string split |
| 3 | Minor | vector_manager.py |
_open_embeddings_table bypasses storage abstraction |
| 4 | Minor | search_engine.py, search_sparse.py |
Redundant get_vector_index_store() calls |
| 5 | Minor | lancedb_stores.py |
LanceDBMetadataStore.get_raw_connection returns uncached connection |
Only finding #1 needs action before merge; the rest are cleanup items for a follow-up.
- introduce MetadataStore/VectorIndexStore/KBWriteCoordinator contracts and LanceDB-backed defaults - refactor RAG management and API paths to use storage abstractions instead of direct LanceDB access - add storage factory singleton reset hook and update tests to depend on storage layer (including collection manager and collections tests) - ensure phase 1A keeps doc_id semantics while remaining compatible with future file_id linkage
- Make LanceDBVectorIndexStore connection lazy to avoid early default-dir binding - Add clear_connection_cache() to reset provider-level cached connections - Isolate LANCEDB_DIR per test and reset KB storage singleton for clean state - Add assertion test to ensure default LanceDB directory is not modified
Use Hub ID as the single source of truth for metadata, embedding rows, and table naming, and add forward-migration compatibility plus test hardening for sparse search and isolated LanceDB migration tests.
Introduce storage abstraction contracts to decouple API and management layers from direct LanceDB dependencies. This enables future backend migration while maintaining backward compatibility. Storage Layer: - Extend MetadataStore with config operations (save/get_collection_config) - Extend VectorIndexStore with aggregate and delete operations (aggregate_collection_stats, aggregate_document_stats, delete_collection_data) - Implement all contracts in LanceDBMetadataStore and LanceDBVectorIndexStore - Add factory singleton (get_kb_write_coordinator) for coordinated access API Layer (api/kb.py): - Replace direct get_connection_from_env() calls with storage abstractions - Remove manual embeddings table updates in rename_collection_api - Use get_metadata_store() and get_vector_index_store() throughout Management Layer (management/collections.py): - Refactor list_collections() to use VectorIndexStore.aggregate_collection_stats() - Refactor delete_collection() to use VectorIndexStore.delete_collection_data() - Refactor cancel_collection() to use VectorIndexStore.list_document_records() - Refactor get_document_stats() to use VectorIndexStore.aggregate_document_stats() - Remove 324 lines of direct LanceDB operations Tests: - Add 90 lines of new storage layer tests - Update multitenancy tests to mock storage abstractions - Update kb_dir API tests for new mock patterns - All 767 RAG tests passing Phase 1A Constraints Met: - Interface decoupling only (no physical database split) - doc_id maintained as primary key - Backward compatibility via get_raw_connection() on all contracts - No changes to existing data schemas
Phase 1A Part 2: Complete storage decoupling for ingestion status tracking. Core Changes: - Add StorageFactory unified class for managing all store singletons - Add IngestionStatusStore contract with sync/async methods - Implement LanceDBIngestionStatusStore with full CRUD operations - Refactor status.py to use storage abstraction layer - Add stub contracts for PromptTemplateStore and MainPointerStore Storage Abstraction Layer: - contracts.py: Add IngestionStatusStore, PromptTemplateStore, MainPointerStore - factory.py: Replace standalone functions with StorageFactory class - Thread-safe singleton pattern with double-checked locking - Unified reset_all() method for test isolation - lancedb_stores.py: Implement LanceDBIngestionStatusStore - Sync methods: write_ingestion_status, load_ingestion_status, clear_ingestion_status - Async methods: *_async variants (delegates to sync for now, true async I/O in Phase 1B) - Helper methods: _build_base_filter, _build_load_filter, _combine_filters Business Logic Refactoring: - status.py: Remove raw connection usage, use get_ingestion_status_store() - Maintain backward-compatible public API - Add async variants: write_ingestion_status_async, load_ingestion_status_async, clear_ingestion_status_async Tests: - test_status.py: Add 4 async tests (12 total tests passing) - test_write_ingestion_status_async - test_write_ingestion_status_overwrites_existing_async - test_load_ingestion_status_by_collection_async - test_clear_ingestion_status_async Related Files: - storage/__init__.py: Export new contracts and factory functions This completes Phase 2.1 of the storage decoupling plan (plan/storage-decoupling-phase2.md). Next phases will extend VectorIndexStore for index management and implement PromptTemplateStore.
- Extend VectorIndexStore contract with index management methods (should_reindex, trigger_reindex with sync/async variants) - Implement index management in LanceDBVectorIndexStore - Add comprehensive index management tests (7 tests) - Remove duplicate index functions from vector_manager.py (_should_reindex, _trigger_reindex now use storage layer) - Update tests to use new StorageFactory API - Fix threading deadlock: use RLock instead of Lock in StorageFactory Test results: - Storage tests: 22/22 passing - Vector manager tests: 38/42 passing (4 pre-existing failures)
…nified filter system - Extend FilterOperator with IS_NULL and IS_NOT_NULL for NULL value handling - Create lancedb_filter_utils.py with shared filter translation functions (translate_condition, format_value, translate_filter_expression) - Implement PromptTemplateStore contract with full CRUD operations (save_prompt_template, get_prompt_template, get_latest_prompt_template, list_prompt_templates, delete_prompt_template + async variants) - Implement LanceDBPromptTemplateStore with version management - Implement MainPointerStore contract with version control operations (set_main_pointer, get_main_pointer, list_main_pointers, delete_main_pointer + async variants) - Implement LanceDBMainPointerStore using unified FilterExpression system (handles model_tag == '' OR model_tag IS NULL for backward compatibility) - Add user_id parameter with deprecation warnings (schema migration required) - Refactor LanceDBVectorIndexStore to use shared filter utilities - Add comprehensive tests (8 new tests for PromptTemplateStore and MainPointerStore) Test results: - Storage tests: 30/30 passing
…ore decoupling Extend PromptTemplateStore and MainPointerStore contracts to support complete business logic, and refactor manager layers to fully use storage abstractions. Changes: 1. PromptTemplateStore contract (contracts.py): - Add update_metadata() - Update metadata only, keep same version/ID - Add delete_by_name() - Delete by name with is_latest management - Add get_versions_by_name() - Get all versions of a template - Add async variants for all new methods 2. LanceDBPromptTemplateStore implementation (lancedb_stores.py): - Implement update_metadata() with in-place metadata update - Implement delete_by_name() with is_latest flag management - Implement get_versions_by_name() for version listing - Fix delete_prompt_template() to update is_latest for remaining versions - Add async variants delegating to sync methods 3. prompt_manager.py refactoring: - update_prompt_template() uses update_metadata() for metadata-only updates - delete_prompt_template() uses delete_by_name() and delete_prompt_template() - list_prompt_templates() uses store abstraction directly - Remove unused imports (escape_lancedb_string, pandas) - Remove unused _deserialize_metadata() function 4. main_pointer_manager.py refactoring: - All functions use MainPointerStore abstraction - Remove get_connection_from_env() function - Remove _build_base_filter_expression() function - Remove ensure_main_pointers_table import - Remove dependencies on escape_lancedb_string and build_lancedb_filter_expression 5. Test updates: - Fix mock objects in test_lancedb_stores.py for new methods - Rewrite test_main_pointer_manager.py to use new abstraction layer - All 78 tests passing (44 prompt_manager + 25 storage + 9 main_pointer_manager) This completes Phase 2.4 requirements: - PromptTemplateStore extended to support complete business logic - MainPointerStore extended to support complete business logic - Manager layers fully refactored to use storage abstractions - No direct get_connection_from_env() usage in business logic - All tests passing
Fix ruff, mypy, and formatting issues reported by pre-commit hooks.
Changes:
1. Import fixes:
- Add missing IndexPolicy import in lancedb_stores.py
- Remove unused imports (cast, IndexPolicy, build_filter_from_dict, datetime)
- Remove unused variables (model_tag_arg, custom_policy)
2. Type safety fixes:
- Add type assertions for name parameter in prompt_manager.py
- Fix FilterExpression type invariants (use tuple instead of list)
- Add type: ignore comments for lancedb.connect_async
- Add cast() for pandas.to_dict("records") return type
3. Code cleanup:
- Remove NULL model_tag fix-up logic in LanceDBMainPointerStore.set_main_pointer()
- Remove duplicate test function (test_write_vectors_reindex_policy_configuration)
- Remove _build_base_filter() call (no longer needed)
4. Formatting:
- Apply ruff-format auto-formatting to all files
All Python linters now pass:
- ruff check: Passed
- ruff format: Passed
- mypy: Passed
- isort: Passed
- codespell: Passed
Note: npm type-check fails due to frontend issues unrelated to this work.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update test mocks and expectations to align with Phase 1A abstraction layer: - test_register_document.py: Fix mock method names (count_rows -> count_rows_or_zero) - test_multitenancy.py: Fix mock paths (status.get_metadata_store -> factory.get_metadata_store) - test_list_candidates.py: Update expected filter expression to include user_id - test_vector_manager.py: Update tests to expect multiple calls after refactoring: - Batch processing tests: 3 calls for upsert_embeddings (batch_size=2) - Chunk reading tests: 2 calls for count_rows_or_zero and iter_batches (chunks + embeddings) - test_index_manager.py: Skip 5 tests for functionality moved to VectorIndexStore All RAG flow tests now passing (781 passed, 6 skipped)
Add autouse fixture to mock _ensure_schema_fields globally, avoiding 'Mock object is not iterable' errors when tests call schema management functions. Also fixes: - test_metadata_store_save_collection_config: add SimpleNamespace schema mock - test_metadata_store_get_collection_config_success: fix iloc[0] access pattern mock - test_vector_store_list_document_records_filters_and_maps: add schema mock All 25 storage tests now passing.
…gging Add structured logging utilities and comprehensive test coverage for Phase 1A storage layer, improving observability and code quality. **Structured Logging (logging_utils.py):** - log_operation: context manager for timing and structured output - log_async_operation: decorator for async operations with automatic timing - log_audit: audit event logging for security/compliance tracking - log_performance: performance metrics logging with optional value parameter **Performance & Audit Logs (lancedb_stores.py):** - search_vectors_async: log top_k, vector_dim, result_count - iter_batches_async: log batch_size, columns_provided - count_rows_async: log row_count, has_filter - upsert_documents_async: log record_count - list_document_records: add audit log for data access tracking **Test Coverage Improvements (+9 tests, 32→41 in test_lancedb_stores.py):** - Async method tests: search_vectors_async, search_fts_async, iter_batches_async, count_rows_async, upsert_documents_async, upsert_chunks_async, upsert_embeddings_async - Core upsert tests: upsert_documents, upsert_parses, upsert_chunks (including edge cases) - Error handling tests: table_not_found, search_failure, invalid_data, invalid_columns **Test Results:** - All 814 RAG tests passing (805→814, +9 new tests) - Async method coverage significantly improved (from 2/30 baseline) - Overall test coverage ~85%
…ive tests - Add get_vector_dimension() method to VectorIndexStore contract - Abstract vector dimension retrieval from table schema - Implement in LanceDBVectorIndexStore with sync/async variants - Refactor rebuild_collection_metadata() to use abstraction layer - Replace get_raw_connection() with count_rows_or_zero() - Replace direct schema access with get_vector_dimension() - Remove legacy .to_pandas() call - Delete unused _get_connection() method in CollectionManager - Add 12 new tests for get_vector_dimension, list_table_names, and rebuild_collection_metadata - Fix mypy error with cast(int, vector_type.list_size) - Fix logging configuration pollution in test_real_ingestion.py - Remove deployment-specific script set_nanwang_embedding_model_id.py
…test failures This commit addresses the remaining unresolved review comments from PR xorbitsai#158 and fixes associated test failures. All critical and major issues are now resolved. - **Comment xorbitsai#2**: Fix inconsistent model field write vs read - Unify write and read paths to use `embedding_config.id` as single source of truth - Eliminates duplicate embeddings caused by model ID mismatch - **Comment xorbitsai#3**: Fix user_id=None losing configs in admin mode - Admin mode (is_admin=True, user_id=None) now keeps uid=None - Enables admin workflows to load all collection configs correctly - **Comment xorbitsai#4**: Implement thread isolation for async execution - Add `_run_in_separate_loop()` to handle async in sync contexts - Prevents "event loop already running" errors in FastAPI handlers - Automatically detects execution context and chooses appropriate strategy - **Comment xorbitsai#6**: Remove auto-migration during read operations - Eliminate `while True` migration loop in `_open_embeddings_table()` - Use legacy fallback only; explicit migration via `migrate_embeddings_table()` - Prevents inline migration without locks or concurrency protection - **Comment xorbitsai#7**: Unify error handling between search_engine and search_sparse - Update `search_sparse.py` to re-raise primary_exc like `search_engine.py` - Ensures consistent error messages for "table not found" scenarios - **Comment xorbitsai#8**: Add structured logging for Hub resolution failures - Replace silent exception swallowing with structured logging - Log error_type, error_message, fallback_behavior, and impact - Applies to both `collection_manager.py` and `migration_utils.py` - **Comment xorbitsai#13**: Fix mock tests to use real storage layer - Update test patches from `get_connection_from_env` to correct paths - Tests now use real storage implementations via `conftest.py` fixtures - Add `temp_lancedb_dir`, `real_store`, `manager_with_real_store` fixtures - 800+ tests passing (up from 757 before fixes) - All mock path issues resolved - Remaining 10 test failures are due to `owners` field (known issue from later development) **Core Implementation:** - `collection_manager.py`: Add `_run_in_separate_loop()` and structured logging - `collections.py`: Fix user_id=None handling for admin mode - `document_ingestion.py`: Unify model field usage - `search_sparse.py`: Match error handling with `search_engine.py` - `migration_utils.py`: Add structured logging - `vector_manager.py`: Remove auto-migration code **Test Files:** - Multiple test files updated for correct mock paths - Add `management/conftest.py` for real storage fixtures - Update 30+ test files to use `get_vector_store_raw_connection` **Storage Layer:** - `lancedb_stores.py`: Maintain `get_connection_from_env` to avoid circular imports - Test patches updated to target correct module paths
…S detection This commit addresses New Finding 1 and New Finding 2 from PR xorbitsai#158 review: - Finding 1: Fix async FTS detection logic to match sync version - Finding 2: Replace fragile string parsing with structured IndexResult **Problem:** `search_sparse_async` incorrectly assumed FTS is enabled when `create_index()` succeeds, but it may only create vector index, not FTS. **Solution:** - Add `fts_enabled` field to IndexResult dataclass - Update `create_index()` to return IndexResult with actual FTS status - Check actual FTS index status using `table.list_indices()` - Update async version to use `index_result_obj.fts_enabled` like sync version **Problem:** `create_index()` returned ad-hoc string format `"status advice: message"` parsed via fragile string split. **Solution:** - Create IndexResult Pydantic model in core/schemas.py - Update VectorIndexStore.create_index() contract to return IndexResult - Add fields: status, advice, fts_enabled - Update all call sites to use structured field access - Remove redundant `get_vector_index_store()` calls (Finding 4) - Fix `get_raw_connection()` caching inconsistency (Finding 5) - All search_sparse tests passing (10/10) - All search_dense tests passing - Tests now use IndexResult objects instead of string mocks **Core Implementation:** - `core/schemas.py`: Add IndexResult Pydantic model - `storage/contracts.py`: Update create_index contract, add IndexResult import - `retrieval/search_engine.py`: Use IndexResult, remove redundant call - `retrieval/search_sparse.py`: Use IndexResult, fix async FTS detection - `storage/lancedb_stores.py`: Return IndexResult, check actual FTS status, fix caching **Test Files:** - Update all mock `create_index.return_value` to return IndexResult - Add IndexResult imports to test files
This commit completes several Phase 1A improvements to fully decouple
the retrieval layer from LanceDB-specific implementations:
1. Added convenience methods to VectorIndexStore contract:
- search_vectors_by_model(): sync vector search by model_tag
- search_vectors_by_model_async(): async vector search by model_tag
- search_fts_by_model_async(): async FTS search by model_tag
These methods internally combine open_embeddings_table() with the
respective search operations, providing a simpler one-step API for the
most common use case of searching by model_tag.
2. Updated retrieval layer to use new convenience methods:
- search_dense_engine (sync/async): now uses search_vectors_by_model
- search_sparse_async: now uses search_fts_by_model_async
This eliminates the previous two-step pattern of opening the table first
then passing the table_name to search functions.
3. Removed legacy IndexManager code:
- Deleted vector_storage/index_manager.py (254 lines)
- Removed _open_embeddings_table, validate_embed_model, and
get_stored_vector_dimension functions from vector_manager.py
- Updated vector_storage/__init__.py exports
- Removed corresponding tests (test_index_manager.py and related tests)
4. Legacy fallback logic now centralized:
- All table opening and legacy fallback logic is now handled by
VectorIndexStore.open_embeddings_table()
- This provides a single source of truth for table name resolution
- Both sync and async code paths benefit from unified implementation
5. Simplified validation:
- validate_query_vector no longer requires database connection
- Dimension validation is now handled by the storage abstraction layer
- Removed conn parameter and DB-dependent validation logic
Changes:
- storage/contracts.py: Added 3 convenience methods with default implementations
- storage/lancedb_stores.py: Added open_embeddings_table() implementation
- retrieval/search_engine.py: Updated to use search_vectors_by_model
- retrieval/search_sparse.py: Updated to use convenience methods
- retrieval/search_dense.py: Removed raw connection usage for validation
- vector_storage/*: Removed legacy IndexManager and related functions
- tests/*: Updated tests to use new abstraction layer, removed deleted function tests
Net changes: +439, -1515 lines across 19 files
…ch issues
This commit completes the test adaptation for Phase 1A storage decoupling
and fixes critical issues in sparse search implementation.
- **Problem**: search_sparse was pre-transforming model_tag with "embeddings_"
prefix, then open_embeddings_table would add it again, causing
"embeddings_embeddings_model" table name errors.
- **Fix**: Removed pre-transformation in search_sparse; open_embeddings_table
now handles all prefix logic internally.
- **Impact**: Sparse search now correctly resolves table names for both
Hub ID and legacy provider-based naming.
- **Problem**: create_index() in readonly mode returned fts_enabled=False
without checking if FTS index already existed.
- **Fix**: In readonly mode, open table and check existing FTS index status
before returning IndexResult.
- **Impact**: Readonly queries can now correctly detect and use existing
FTS indexes, improving test coverage and production query accuracy.
- Updated test mocks to use search_vectors_by_model() instead of
raw connection pattern.
- Removed build_filter_expression assertions (now called inside
abstraction layer).
- Fixed validate_query_vector call expectations (no longer passes
model_tag parameter).
- Added `import unittest` for unittest.mock.ANY usage.
- Updated tests to patch get_vector_index_store instead of
get_vector_store_raw_connection.
- Fixed open_embeddings_table return value expectations (now returns
tuple (table, table_name)).
- Updated model_tag assertions to expect untransformed values
(abstraction layer handles prefix).
- Fixed test_search_sparse_database_error to patch resolve_embedding_adapter
in correct location (utils.model_resolver).
- Updated all create_index mock returns to return IndexResult objects
instead of strings.
- Fixed test_write_vectors_index_status_aggregation to use
IndexResult objects in side_effect.
- test_search_dense.py: 13 passed
- test_search_sparse.py: 10 passed
- test_collections.py: 8 passed
- test_vector_manager.py: 37 passed (1 skipped)
- test_document_search.py: 8 passed
- **Total: 76 passed, 1 skipped**
```python
model_tag = f"embeddings_{to_model_tag(model_tag)}" # "embeddings_test_model"
table, model_tag = vector_store.open_embeddings_table(model_tag)
```
```python
table, actual_table_name = vector_store.open_embeddings_table(model_tag)
```
```python
if readonly:
return IndexResult(
status="readonly",
advice=...,
fts_enabled=False, # Always False!
)
```
```python
if readonly:
fts_enabled = False
try:
table = conn.open_table(table_name)
indexes = table.list_indices()
fts_enabled = any(
idx.index_type == "FTS" and "text" in idx.columns
for idx in indexes
)
except Exception as e:
logger.debug("Unable to check FTS index status: %s", e)
return IndexResult(
status="readonly",
advice=...,
fts_enabled=fts_enabled, # Actual status!
)
```
This commit addresses the remaining issues from PR xorbitsai#158 review: 1. Exception Handling Consistency: - Add try-except block to search_dense (Sync) to return structured error responses - All four search entry points now have unified exception handling pattern: * search_dense (Sync) - now returns DenseSearchResponse with status="failed" * search_dense_async (Async) - already had exception handling * search_sparse (Sync) - already had exception handling * search_sparse_async (Async) - already had exception handling 2. Test Mock Level Improvements: - Replace mock-based tests with real storage tests in TestSyncFunctions - Add integration test for rebuild_collection_metadata - Update multitenancy tests to use CollectionManager instead of non-existent register_collection - All tests now verify complete data flow through storage layer 3. Verified Fixes: - Model field read/write consistency: both paths use embedding_config.id - FTS detection: uses IndexResult.fts_enabled instead of assuming create_index success - IndexResult: structured return type replaces fragile string parsing All 52 related tests pass successfully.
39755dd to
b325783
Compare
…nections The original test had a design flaw where all threads shared the same LanceDB connection object, which is not thread-safe and causes transaction conflicts. Changes: 1. Removed threading lock from schema_manager.py (wrong approach) 2. Fixed test_concurrent_ensure_collection_metadata_table_is_safe to use separate connections for each thread This correctly tests table creation idempotency without hitting LanceDB's threading limitations.
Fixed a typo where 'last_accessed' was used instead of 'last_accessed_at' in the ensure_collection_metadata_table function. This caused the test_ensure_schema_fields_idempotency test to fail because: 1. First call created table with 'last_accessed' field 2. Second call tried to add 'last_accessed_at' field (different name) 3. LanceDB's schema alignment failed due to field name mismatch The fix ensures field name consistency across the schema definition. All 16 schema migration tests now pass successfully.
rogercloud
left a comment
There was a problem hiding this comment.
Re-Review: All Previous Comments Addressed
I've re-verified all 19 review comments against the latest code.
Critical (3/3 FIXED)
aggregate_collection_statsOOM → usesiter_batches()withcolumns=["collection"]- Inconsistent
modelfield write/read → both paths useembedding_config.id user_id or 0losing configs →is_adminanduser_id=Nonehandled explicitly
Major (6/6 FIXED)
asyncio.run()per collection →_run_in_separate_loop()with per-thread event loopget_connection_from_envduplicated in 12+ files → only inlancedb_stores.pynow- Auto-migration during read → explicit
migrate_embeddings_table()method - Inconsistent error handling in search → unified storage abstraction
- Hub resolution silently swallows errors → logs warning with full context
search_sparse_asyncFTS assumption → usescreate_index().fts_enabledchecking actual status
Minor (8/10 FIXED, 1 not fixed, 1 waived)
10-12, 15-19: All fixed (tag collision via register_tag_mapping(), structured IndexResult, fixture consolidation, etc.)
14: Tests mock at module level — minor, can be a follow-up
13: One-off migration script — acceptable
LGTM, all critical and major issues resolved.
|
congratulations |
Can you join the force for the next steps as we discussed before? |
Yeah, I’ll review the updates from recent weeks to evaluate the functions and module boundaries. After that, I'll start working on adding Milvus support as discussed. |
Following up on the Milvus step after #158. Opened #382 for the narrow path:
This should stay orthogonal to #219. |