Pull request overview
Refactors CortexaDB’s public API and internal terminology from namespaces to collections, and renames primary operations from remember/ask/delete_memory to add/search/delete across Rust core, Python bindings, tests, examples, and documentation.
Changes:
- Renamed core APIs (remember→add, ask→search, delete_memory→delete) and updated scoping (namespace→collection) end-to-end.
- Updated query/index/graph/storage layers to scope by collection and renamed associated types/functions.
- Migrated docs, examples, benchmarks, and test suites to the new API surface; bumped crate versions in Cargo.lock.
Reviewed changes
Copilot reviewed 46 out of 50 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| examples/rust/basic_usage.rs | Updates Rust example to add/search and collection-scoped batch records. |
| docs/resources/examples.md | Updates example snippets to add/search and collections terminology. |
| docs/index.md | Updates docs index link from namespaces to collections. |
| docs/guides/replay.md | Updates replay guide terminology and operation names (add/delete). |
| docs/guides/query-engine.md | Updates query engine guide to collection scoping and search. |
| docs/guides/namespaces.md | Removes namespaces guide. |
| docs/guides/embedders.md | Updates embedder guide examples to add/search. |
| docs/guides/core-concepts.md | Updates core concepts terminology to collections + new API names. |
| docs/guides/collections.md | Adds new collections guide replacing namespaces guide. |
| docs/guides/chunking.md | Updates ingestion examples to use collection= parameter. |
| docs/getting-started/quickstart.md | Updates quickstart examples to add/search and collections. |
| docs/content/docs/resources/examples.mdx | Mirrors examples updates for site content. |
| docs/content/docs/guides/replay.mdx | Mirrors replay guide updates for site content. |
| docs/content/docs/getting-started/quickstart.mdx | Mirrors quickstart updates for site content. |
| docs/content/docs/api/rust.mdx | Updates Rust API reference to add/search/delete and collection field. |
| docs/api/rust.md | Updates Rust API reference to add/search/delete and collection field. |
| docs/api/python.md | Updates Python API reference to add/search/delete and collections terminology. |
| crates/cortexadb-py/test_stress.py | Migrates stress tests to db.add. |
| crates/cortexadb-py/test_smoke.py | Migrates smoke tests to collections + add/search/delete and wrapper usage. |
| crates/cortexadb-py/src/lib.rs | Updates PyO3 bindings: rename methods, remove namespace field, map to collection APIs. |
| crates/cortexadb-py/cortexadb/replay.py | Renames replay writer op from remember to add. |
| crates/cortexadb-py/cortexadb/providers/openai.py | Updates provider docstring example to add/search. |
| crates/cortexadb-py/cortexadb/providers/ollama.py | Updates provider docstring example to add/search and cleans imports. |
| crates/cortexadb-py/cortexadb/providers/gemini.py | Updates provider docstring example to add/search. |
| crates/cortexadb-py/cortexadb/loader.py | Cleans unused typing imports. |
| crates/cortexadb-py/cortexadb/embedder.py | Updates embedder docs/examples to add/search. |
| crates/cortexadb-py/cortexadb/client.py | Refactors Python wrapper to collections + add/search/delete, replay op rename, removes legacy aliases. |
| crates/cortexadb-py/cortexadb/chunker.py | Cleans unused typing imports. |
| crates/cortexadb-py/.gitignore | Adds common build artifacts (*.whl, *.egg-info, etc.). |
| crates/cortexadb-core/tests/integration.rs | Migrates integration tests to add/search/delete and collection field. |
| crates/cortexadb-core/src/store.rs | Renames delete op + indexes embedding by collection. |
| crates/cortexadb-core/src/storage/wal.rs | Updates WAL tests for Command::Delete. |
| crates/cortexadb-core/src/storage/serialization.rs | Updates serialization test to expect collection. |
| crates/cortexadb-core/src/storage/segment.rs | Updates segment test to expect collection. |
| crates/cortexadb-core/src/query/hybrid.rs | Renames query option scoping to collection and updates graph traversal helpers. |
| crates/cortexadb-core/src/query/executor.rs | Renames executor scoping to collection and updates graph traversal helpers. |
| crates/cortexadb-core/src/index/vector.rs | Refactors vector partitions/indexing from namespace to collection. |
| crates/cortexadb-core/src/index/graph.rs | Refactors graph BFS helpers from namespace to collection and enforces collection boundaries. |
| crates/cortexadb-core/src/facade.rs | Renames public facade API to add/search/delete and updates collection scoping/filtering. |
| crates/cortexadb-core/src/engine.rs | Updates engine command handling to Command::Delete and collection terminology. |
| crates/cortexadb-core/src/core/state_machine.rs | Refactors state machine terminology/errors to collection scoping and delete rename. |
| crates/cortexadb-core/src/core/memory_entry.rs | Renames memory field namespace→collection. |
| crates/cortexadb-core/src/core/command.rs | Renames command variant to Delete and updates constructors/tests. |
| crates/cortexadb-core/src/bin/sync_bench.rs | Updates bench CLI/config from namespace to collection. |
| crates/cortexadb-core/src/bin/startup_bench.rs | Migrates bench seeding to db.add. |
| crates/cortexadb-core/src/bin/manual_store.rs | Updates manual store example query options to collection scoping. |
| crates/cortexadb-core/benches/storage_bench.rs | Updates benchmark ingestion to db.add. |
| benchmark/cortexadb_runner.py | Updates benchmark runner to add and _inner.search_embedding. |
| Cargo.lock | Bumps cortexadb-core and cortexadb-py versions to 0.1.8. |
| CHANGELOG.md | Removes changelog file. |
      limit = limit or kwargs.get("limit") or kwargs.get("top_k", 5)
      vector = vector or kwargs.get("vector") or kwargs.get("embedding") or kwargs.get("query_vector")
    - collections = collections or kwargs.get("collections") or kwargs.get("namespaces")
    + collections = collections or kwargs.get("collections") or kwargs.get("collection")
      vec = self._resolve_embedding(query, vector)

      if collections is None:
    -     base_hits = self._inner.ask_embedding(vec, top_k=limit, filter=filter)
    +     base_hits = self._inner.search_embedding(vec, top_k=limit, filter=filter)
      elif len(collections) == 1:
    -     base_hits = self._inner.ask_in_collection(collections[0], vec, top_k=limit, filter=filter)
    +     base_hits = self._inner.search_in_collection(collections[0], vec, top_k=limit, filter=filter)
      else:
collections = ... or kwargs.get("collection") can set collections to a string when callers pass collection="agent_a". Because a string is itself a sequence, len(collections) then returns the length of the name and collections[0] returns its first character, so scoped search targets the wrong collection. Normalize a single collection string into a one-element list, and keep the namespaces alias if backwards compatibility is still wanted.
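A minimal sketch of the normalization suggested above, written as standalone helpers so it can be tried outside the wrapper. The names normalize_collections and resolve_collections are hypothetical and not part of CortexaDB; retaining the legacy namespaces keyword is an assumption about the desired compatibility policy.

```python
from typing import Any, Iterable, List, Optional, Union


def normalize_collections(
    value: Union[None, str, Iterable[str]],
) -> Optional[List[str]]:
    """Return None, or a list of collection names.

    A bare string is wrapped in a one-element list so that len() and
    indexing operate on collection names rather than on characters.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return [value]
    return list(value)


def resolve_collections(
    collections: Union[None, str, Iterable[str]] = None, **kwargs: Any
) -> Optional[List[str]]:
    """Pick the first keyword that was supplied, then normalize it.

    Hypothetical helper: accepts "collections", "collection", and the
    legacy "namespaces" alias (keeping the alias is an assumption).
    """
    raw = collections
    for key in ("collections", "collection", "namespaces"):
        if raw is None:
            raw = kwargs.get(key)
    return normalize_collections(raw)
```

With this in place, resolve_collections(collection="agent_a") yields a one-element list, so a len(collections) == 1 branch would see the full name instead of a single character.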
      report["op_counts"][op_type] = report["op_counts"].get(op_type, 0) + 1

      try:
    -     if op_type == "remember":
    +     if op_type == "add":
              new_id = db.add(
                  text=op.get("text"),
                  vector=op.get("embedding"),
                  metadata=op.get("metadata"),
    -             collection=op.get("collection") or op.get("namespace", "default")
    +             collection=op.get("collection") or "default"
              )
              id_map[op.get("id")] = new_id
              report["exported"] += 1
replay() builds report using keys like checked/exported, but the documented last_replay_report schema expects total_ops, applied, skipped, failed, op_counts, and failures (see docs/api/python.md). Consider aligning the report keys/counting (e.g., total_ops == operations read, applied == ops successfully executed) and collecting failure details up to the documented limit. Also, docs describe strict=False as the default, but replay() currently defaults strict to True; aligning defaults would avoid surprising behavior for callers.
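One way the report bookkeeping could line up with the documented keys is sketched below. It is illustrative only: replay_ops, the handlers mapping, the "type" key on each operation, the shape of each failures entry, and the MAX_FAILURES value are all assumptions, not the wrapper's actual implementation or the documented limit.

```python
from typing import Any, Callable, Dict, Iterable, Mapping

# Placeholder cap on recorded failure details (assumed value).
MAX_FAILURES = 50


def replay_ops(
    ops: Iterable[Mapping[str, Any]],
    handlers: Mapping[str, Callable[[Mapping[str, Any]], Any]],
    strict: bool = False,
    max_failures: int = MAX_FAILURES,
) -> Dict[str, Any]:
    """Apply each operation and return a report using the documented keys."""
    report: Dict[str, Any] = {
        "total_ops": 0,   # operations read from the log
        "applied": 0,     # operations executed successfully
        "skipped": 0,     # operations with no known handler
        "failed": 0,      # operations whose handler raised
        "op_counts": {},  # per-type tally of operations read
        "failures": [],   # failure details, capped at max_failures
    }
    for index, op in enumerate(ops):
        report["total_ops"] += 1
        op_type = op.get("type")  # assumed key name
        report["op_counts"][op_type] = report["op_counts"].get(op_type, 0) + 1

        handler = handlers.get(op_type)
        if handler is None:
            report["skipped"] += 1
            continue
        try:
            handler(op)
            report["applied"] += 1
        except Exception as exc:
            if strict:
                raise  # strict mode stops at the first failing operation
            report["failed"] += 1
            if len(report["failures"]) < max_failures:
                report["failures"].append(
                    {"index": index, "op": op_type, "error": str(exc)}
                )
    return report
```

Defaulting strict to False here mirrors what the docs are said to describe; if the stricter default is intended instead, the docs would be the side to change.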
      match cmd {
    -     Command::DeleteMemory(id) => assert_eq!(id, MemoryId(1)),
    +     Command::Delete(id) => assert_eq!(id, MemoryId(1)),
          _ => panic!("Expected DeleteMemory"),
The panic message in this test still says "Expected DeleteMemory" even though the variant is now Command::Delete. Updating the message will make failures less confusing.
Suggested change:
    - _ => panic!("Expected DeleteMemory"),
    + _ => panic!("Expected Delete"),
      // Bulk insert with 100x speedup
    - let last_id = db.remember_batch(records)?;
    - println!(" Batch finished. Last inserted ID: {}", last_id);
    + let last_id = db.add_batch(records)?;
    + println!(" Batch finished. Last inserted ID: {}", last_id.last().unwrap());
db.add_batch(records) now returns a Vec<u64> of inserted IDs, but the variable is named last_id and the example prints last_id.last().unwrap(). Renaming the variable (e.g., ids) and avoiding the unwrap() (or explicitly asserting the vector is non-empty in the example) would make the sample clearer and less panic-prone if someone adapts it to an empty batch.
      for hit in hits:
          mem = db.get_memory(hit.id)
          print(f"[{hit.score:.3f}] {mem.content.decode()}")
These Python examples still use db.get_memory(hit.id), but the current Python wrapper API exposes db.get(mid) (and tests use db.get(...)). Update the docs to match the public Python API so the snippets run as-is.
      writer.add("Draft: AI agents are transforming...")

      # Each agent queries only its own memories
      planner_context = planner.search("What tsearchs are pending?")
Typo in example query string: "What tsearchs are pending?" looks unintended and should read "What tasks are pending?" (or similar).
    @@ -68,9 +68,9 @@ db.ingest("Long article text here...", strategy="markdown")
      ### 7. Use Namespaces
Section title still says "Use Namespaces" but the example has been migrated to db.collection(...). Update the heading to "Use Collections" to match the new API terminology.
Description
Big refactor