Add Azure Cosmos DB (NoSQL) full-text search provider #38

Open
mhelleborg wants to merge 2 commits into main from claude/provider-cosmosdb-k9u9fc

@mhelleborg

Owner

Implements the Azure Cosmos DB (NoSQL API) provider — issue #28, part of the multi-provider epic #23. Mirrors the Postgres provider's structure and plugs into the shared SearchLite.Conformance suite.

What's here

  • Source/SearchLite.CosmosDB: SearchManager + SearchIndex<T> on Microsoft.Azure.Cosmos 3.61. Each logical index maps to one container in a single database (default searchlite). The document is stored verbatim as JSON under a doc envelope property alongside a denormalized searchText field. The Cosmos id and the partition key are both set to the document id (partition key path /id), so point reads and writes are cheap single-partition operations.
  • Full-text (GA FTS APIs): containers are provisioned with a FullTextPolicy + full-text index over /searchText; queries match with FullTextContains and rank via ORDER BY RANK FullTextScore(...) (BM25).
  • WhereClauseBuilder<T> translates every Operator into parameterized Cosmos NoSQL WHERE clauses (c.doc["Field"] accessors, ARRAY_CONTAINS for collection membership, CONTAINS/STARTSWITH/ENDSWITH + case-insensitive variants, IS_DEFINED/IS_NULL, IN), with OFFSET/LIMIT paging.
  • Tests/SearchLite.CosmosDB.Tests — concrete subclass of the shared conformance suite via a Testcontainers Cosmos emulator fixture, plus container-naming and Cosmos-SQL translation unit tests.
  • Added Microsoft.Azure.Cosmos to Directory.Packages.props.

Verification

  • Both projects build clean in Release across net8.0/net9.0/net10.0.
  • The 26 infrastructure-free unit tests pass.
  • The parallel CI matrix auto-discovers SearchLite.CosmosDB.Tests.

⚠️ Known limitations

  • Synthetic score: Cosmos exposes FullTextScore only inside ORDER BY RANK and won't let it be projected into SELECT. Ranking is therefore correct (server-side BM25 via ORDER BY RANK), but SearchResult.Score/MaxScore are a synthesized, strictly decreasing 1/(1+rank), and MinScore filters against that synthetic value, so the semantics differ from Postgres ts_rank. (This is exactly the kind of cross-provider scoring divergence flagged in epic #23.)
  • Not validated live: the conformance suite needs the Cosmos emulator + Docker (unavailable locally), so FTS tokenization, partial vs. full match (IncludePartialMatches), Unicode handling, and the exact result counts the shared suite asserts are unverified and may need tuning once the emulator can run in CI.
  • The SDK's build-time Newtonsoft.Json check is bypassed via AzureCosmosDisableNewtonsoftJsonCheck (the SDK still resolves its own Newtonsoft at runtime).
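The synthetic score limitation is easier to reason about with a small sketch. The Python below is an illustration, not the provider's code. It assumes rank is zero-based (so the top hit scores 1.0), and the function name and return shape are hypothetical.

```python
from typing import Optional


def synthetic_scores(ranked_ids: list[str], min_score: Optional[float] = None):
    """Assign 1/(1+rank) to ids already ordered by the server (best first)."""
    results = []
    for rank, doc_id in enumerate(ranked_ids):  # assumption: rank starts at 0
        score = 1.0 / (1 + rank)
        if min_score is not None and score < min_score:
            break  # scores strictly decrease, so nothing later can pass
        results.append((doc_id, score))
    max_score = results[0][1] if results else 0.0
    return results, max_score


hits, max_score = synthetic_scores(["a", "b", "c"], min_score=0.4)
```

Under these assumptions a MinScore threshold acts as a cap on result count rather than a relevance cut-off: a threshold of 0.4 keeps at most the top two hits regardless of how well they match, which is why the semantics differ from a true ts_rank filter.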

🤖 Generated with Claude Code



claude added 2 commits June 27, 2026 22:55
Implements SearchLite.CosmosDB mirroring the Postgres provider:
- One Cosmos container per index, partition key /id, document stored as
  JSON under a "doc" envelope property alongside a denormalized
  "searchText" field that the container's full-text index policy targets.
- Full-text search via FullTextContains + ORDER BY RANK FullTextScore
  (BM25) using the GA FTS APIs in Microsoft.Azure.Cosmos 3.61.0.
- WhereClauseBuilder<T> translating every Operator case into parameterized
  Cosmos NoSQL WHERE clauses (ARRAY_CONTAINS for collection membership,
  CONTAINS/STARTSWITH/ENDSWITH for string ops, IS_DEFINED/IS_NULL for null
  semantics, IN for set membership), offset/limit pagination.
- Test project reusing the shared abstract conformance suite via a
  Testcontainers Cosmos emulator fixture, plus container-naming and
  WhereClause unit tests (26 unit tests passing without infrastructure).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01AHZB8AqzqRcBEuzRurzJFf