Skip to content

feat: add meilisearch search plugin#14

Open
Ryrahul wants to merge 17 commits into
vendurehq:mainfrom
Ryrahul:feat/add-meilisearch-plugin
Open

feat: add meilisearch search plugin#14
Ryrahul wants to merge 17 commits into
vendurehq:mainfrom
Ryrahul:feat/add-meilisearch-plugin

Conversation

@Ryrahul
Copy link
Copy Markdown

@Ryrahul Ryrahul commented Mar 10, 2026

Add Meilisearch-powered search plugin as a drop-in replacement for the default search. Supports full-text search with typo tolerance, synonyms, stop words, AI hybrid search (semantic + keyword) via OpenAI/HuggingFace/ Ollama/REST embedders, faceted search, price range filtering, similar document recommendations, and custom product/variant field mappings.

Also adds mock data and updated populate script for dev server testing.

@Ryrahul Ryrahul marked this pull request as draft March 11, 2026 06:05
@Ryrahul
Copy link
Copy Markdown
Author

Ryrahul commented Mar 11, 2026

E2E test remaining

@Ryrahul Ryrahul marked this pull request as ready for review April 3, 2026 18:41
@vendure-developer-hub
Copy link
Copy Markdown

vendure-developer-hub Bot commented Apr 3, 2026

Community PluginsView preview

2dcb987

@grolmus
Copy link
Copy Markdown

grolmus commented May 11, 2026

@michaelbromley — pulled this branch locally to get a feel for the scope. The plugin itself is in good shape on the keyword-search side, but the AI/hybrid-search surface is non-trivial and I'd like your call on it before we go further. Summary below; happy to take any direction.

(Separately, the PR also needs a rebase — dev-server/dev-config.ts conflicts with main — and the build is failing on a single lint error in meilisearch.service.ts:608. Both are quick fixes and orthogonal to the scope question.)

cc @Ryrahul — thanks for the very thorough work here. Not asking for changes yet; the comments below are mostly questions for the maintainers about how broad the AI surface should be in a v1 community plugin.


AI/embedder surface — what's actually in the PR

The plugin exposes a configurable embedder layer that forwards to one of five sources:

source: 'openAi' | 'huggingFace' | 'ollama' | 'rest' | 'userProvided'

Per-embedder config (in AiSearchConfig.embedders: Record<string, EmbedderConfig>):

  • model, dimensions, documentTemplate (Liquid), documentTemplateMaxBytes
  • apiKey (required for OpenAI; optional for others)
  • url (required for ollama and rest)
  • request / response / headers (free-form Record<string, any> for rest)

Two consumer-facing capabilities:

  1. Hybrid search — every regular search() call gains a semanticRatio (0.0 keyword-only → 1.0 semantic-only, default 0.5). Pushed straight to Meilisearch's hybrid endpoint.
  2. similarDocuments API — a "More like this / customers also viewed" recommender that takes a document id and runs index.searchSimilarDocuments({ id, embedder, ... }).

Important: we don't call OpenAI/HF/Ollama ourselves. Meilisearch is the integration point; credentials are forwarded to Meilisearch which calls the upstream provider. The plugin's job is to pass config through and shape request/response.

Things to weigh

  1. Blast radius is bounded. Because the plugin doesn't hold the AI vendor surface, an API change at OpenAI/HF lands as a Meilisearch issue, not ours. We're not vendor-locked into any single embedder.

  2. Secret handling is the user's problem. apiKey: string is plaintext-forwarded to Meilisearch. No envelope encryption, no rotation hook, no secret-store integration. README would need to be unambiguous: load from env or secret manager, rotate on Meilisearch restart. Standard, worth saying explicitly.

  3. rest source is an open passthrough. request: Record<string, any> and headers: Record<string, string> mean any HTTP API can be wired. From our side it's fine, but the security surface (e.g. pointing it at an internal service with weak auth) is on the user.

  4. No usage-cost guardrails. Embeddings are generated for every indexed document; hybrid search runs on every query. For a large catalog with text-embedding-3-large this is real money. The plugin has no throttling, batching config, or "AI off in dev" switch beyond "don't configure ai.embedders." Suggest a doc warning with rough cost-per-1k-docs, and possibly an env-aware default.

  5. No telemetry / audit hooks. Users who want to log "which queries used semantic, what semanticRatio was applied, how often does the keyword fallback fire" have to wrap the resolver themselves. Probably acceptable for a v1; a richer SearchEvent payload exposing AI params would be a small addition later.

  6. Fallback on embedder failure is silent. In meilisearch.service.ts around line 218, when Meilisearch can't find the configured embedder during a search (e.g. embedder settings absent during a reindex-swap window), the plugin retries without hybrid. A warning is logged; no metric. So a misconfiguration silently degrades AI search to keyword and you only see it in logs.

  7. similarDocuments is a soft no-op when AI is unconfigured. It logs a warning and returns { items: [], totalItems: 0 } rather than throwing. If a frontend ships a "Similar products" carousel and AI config is misconfigured in prod, the carousel silently renders empty. Probably should throw or return an error result type a frontend can branch on.

  8. Maintenance surface. No new transitive deps beyond meilisearch (^0.55.0). Config surface in options.ts is 811 lines, ~150 of which are AI/embedder. Every Meilisearch release touching the embedder API may need a tracking update — bounded but real.

  9. No tests covering the AI paths. e2e/meilisearch-plugin.e2e-spec.ts (749 lines) covers indexing + full-text but not hybrid search or similarDocuments. Real gap — hybrid is the main new capability and there's no regression guard. Doable without external network using the userProvided embedder mode.

  10. AI is fully opt-in. Users who don't configure ai.embedders get a regular Meilisearch plugin; isAiSearchEnabled gates every AI-touching path. So the maintenance burden is bounded — if we ever want to remove AI, the keyword side keeps working.

My take

The AI bit is a thin passthrough to Meilisearch's native vector capability, not a homegrown embedding stack. Risk-wise it's defensible. What I'd want before merge:

  • A userProvided e2e exercising the hybrid path end-to-end (no external network).
  • README section explicit about secret handling, cost, and the userProvided escape hatch for testing.
  • similarDocuments should error or return a result type a frontend can branch on, not silently return empty.
  • Decision on whether AI lives in the same plugin or a sibling @vendure-community/meilisearch-ai-plugin. Either is defensible.

Curious what you think on scope — keep AI in v1, split it out, or trim it down to just hybrid search and drop similarDocuments for a later release?

@michaelbromley
Copy link
Copy Markdown
Member

Hey @Ryrahul - before we proceed here, can you give me some background around your use of this plugin and the degree to which you or your company intend to maintain it?

@Ryrahul
Copy link
Copy Markdown
Author

Ryrahul commented May 11, 2026

@michaelbromley — pulled this branch locally to get a feel for the scope. The plugin itself is in good shape on the keyword-search side, but the AI/hybrid-search surface is non-trivial and I'd like your call on it before we go further. Summary below; happy to take any direction.

(Separately, the PR also needs a rebase — dev-server/dev-config.ts conflicts with main — and the build is failing on a single lint error in meilisearch.service.ts:608. Both are quick fixes and orthogonal to the scope question.)

cc @Ryrahul — thanks for the very thorough work here. Not asking for changes yet; the comments below are mostly questions for the maintainers about how broad the AI surface should be in a v1 community plugin.

AI/embedder surface — what's actually in the PR

The plugin exposes a configurable embedder layer that forwards to one of five sources:

source: 'openAi' | 'huggingFace' | 'ollama' | 'rest' | 'userProvided'

Per-embedder config (in AiSearchConfig.embedders: Record<string, EmbedderConfig>):

  • model, dimensions, documentTemplate (Liquid), documentTemplateMaxBytes
  • apiKey (required for OpenAI; optional for others)
  • url (required for ollama and rest)
  • request / response / headers (free-form Record<string, any> for rest)

Two consumer-facing capabilities:

  1. Hybrid search — every regular search() call gains a semanticRatio (0.0 keyword-only → 1.0 semantic-only, default 0.5). Pushed straight to Meilisearch's hybrid endpoint.
  2. similarDocuments API — a "More like this / customers also viewed" recommender that takes a document id and runs index.searchSimilarDocuments({ id, embedder, ... }).

Important: we don't call OpenAI/HF/Ollama ourselves. Meilisearch is the integration point; credentials are forwarded to Meilisearch which calls the upstream provider. The plugin's job is to pass config through and shape request/response.

Things to weigh

  1. Blast radius is bounded. Because the plugin doesn't hold the AI vendor surface, an API change at OpenAI/HF lands as a Meilisearch issue, not ours. We're not vendor-locked into any single embedder.
  2. Secret handling is the user's problem. apiKey: string is plaintext-forwarded to Meilisearch. No envelope encryption, no rotation hook, no secret-store integration. README would need to be unambiguous: load from env or secret manager, rotate on Meilisearch restart. Standard, worth saying explicitly.
  3. rest source is an open passthrough. request: Record<string, any> and headers: Record<string, string> mean any HTTP API can be wired. From our side it's fine, but the security surface (e.g. pointing it at an internal service with weak auth) is on the user.
  4. No usage-cost guardrails. Embeddings are generated for every indexed document; hybrid search runs on every query. For a large catalog with text-embedding-3-large this is real money. The plugin has no throttling, batching config, or "AI off in dev" switch beyond "don't configure ai.embedders." Suggest a doc warning with rough cost-per-1k-docs, and possibly an env-aware default.
  5. No telemetry / audit hooks. Users who want to log "which queries used semantic, what semanticRatio was applied, how often does the keyword fallback fire" have to wrap the resolver themselves. Probably acceptable for a v1; a richer SearchEvent payload exposing AI params would be a small addition later.
  6. Fallback on embedder failure is silent. In meilisearch.service.ts around line 218, when Meilisearch can't find the configured embedder during a search (e.g. embedder settings absent during a reindex-swap window), the plugin retries without hybrid. A warning is logged; no metric. So a misconfiguration silently degrades AI search to keyword and you only see it in logs.
  7. similarDocuments is a soft no-op when AI is unconfigured. It logs a warning and returns { items: [], totalItems: 0 } rather than throwing. If a frontend ships a "Similar products" carousel and AI config is misconfigured in prod, the carousel silently renders empty. Probably should throw or return an error result type a frontend can branch on.
  8. Maintenance surface. No new transitive deps beyond meilisearch (^0.55.0). Config surface in options.ts is 811 lines, ~150 of which are AI/embedder. Every Meilisearch release touching the embedder API may need a tracking update — bounded but real.
  9. No tests covering the AI paths. e2e/meilisearch-plugin.e2e-spec.ts (749 lines) covers indexing + full-text but not hybrid search or similarDocuments. Real gap — hybrid is the main new capability and there's no regression guard. Doable without external network using the userProvided embedder mode.
  10. AI is fully opt-in. Users who don't configure ai.embedders get a regular Meilisearch plugin; isAiSearchEnabled gates every AI-touching path. So the maintenance burden is bounded — if we ever want to remove AI, the keyword side keeps working.

My take

The AI bit is a thin passthrough to Meilisearch's native vector capability, not a homegrown embedding stack. Risk-wise it's defensible. What I'd want before merge:

  • A userProvided e2e exercising the hybrid path end-to-end (no external network).
  • README section explicit about secret handling, cost, and the userProvided escape hatch for testing.
  • similarDocuments should error or return a result type a frontend can branch on, not silently return empty.
  • Decision on whether AI lives in the same plugin or a sibling @vendure-community/meilisearch-ai-plugin. Either is defensible.

Curious what you think on scope — keep AI in v1, split it out, or trim it down to just hybrid search and drop similarDocuments for a later release?

Yeah, I think separating the AI/vector functionality into an optional companion plugin could make sense for the initial v1.
Something like a dedicated meilisearch-ai layer also gives us room to iterate independently on embeddings, hybrid search, recommendations, provider support, telemetry, etc. without increasing the surface area of the base plugin too early.

The core Meilisearch plugin already provides solid value with indexing + keyword/full-text search, while the AI side introduces a much broader surface area (embedders, provider configs, hybrid ranking, recommendations, cost/ops concerns, etc.).

Keeping those concerns isolated in something like a meilisearch-ai plugin might make the maintenance and adoption story cleaner, while still allowing the current architecture and AI work to be reused almost as-is.

@Ryrahul
Copy link
Copy Markdown
Author

Ryrahul commented May 11, 2026

Hey @Ryrahul - before we proceed here, can you give me some background around your use of this plugin and the degree to which you or your company intend to maintain it?

Hey @michaelbromley , we do use this plugin in one of our core client projects, which is partly why I ended up investing quite a bit into extending it.

I can’t really speak on behalf of the company regarding long-term ownership commitments, but personally I’m happy to continue maintaining and contributing to it going forward. Even outside of immediate project needs, I’d still be able to spare time for fixes, compatibility updates, and maintenance around the plugin itself.

Ryrahul added 7 commits May 12, 2026 00:50
Add Meilisearch-powered search plugin as a drop-in replacement for the
default search. Supports full-text search with typo tolerance, synonyms,
stop words, AI hybrid search (semantic + keyword) via OpenAI/HuggingFace/
Ollama/REST embedders, faceted search, price range filtering, similar
document recommendations, and custom product/variant field mappings.

Also adds mock data and updated populate script for dev server testing.
- Revert dev-server config to match main, keep only commented meilisearch entries
- Change output dir from dist/ to lib/ to match other plugins
- Align peerDependencies and devDependencies versions
- Add provenance, rimraf, typescript to match elasticsearch-plugin
- Add CHANGELOG.md for initial 1.0.0 release
- Register meilisearch-plugin in docs pipeline and generate docs
- Update .gitignore to match repo convention
- Add comprehensive e2e test suite (51 tests) mirroring elasticsearch plugin
- Fix index swap to not pass rename:false which prevented settings (embedders)
  from being carried over during reindex
- Add graceful fallback to keyword search when AI embedder is temporarily
  unavailable during reindex swap window
- Fix groupByProduct totalItems using facetDistribution for accurate counts
- Add collectionIds/collectionSlugs (plural) filter support
- Fix grouped facetValues/collections to use product-level fields with distinct
- Add e2e/watch scripts to package.json
These files were added for local testing but are not needed by the plugin.
The dev-server uses the punchout-gateway fixture data from main.
Remove all AI/embedder/hybrid-search/similarDocuments code to ship a
clean keyword-search-only Meilisearch plugin for v1. The AI layer can
be added back later as a companion plugin.

Removed:
- EmbedderConfig and AiSearchConfig interfaces from options
- Hybrid search params injection and fallback logic from service
- similarDocuments method, resolver, and GraphQL schema
- AI embedder configuration from index setup
- All AI-related exports, docs, and README sections
… fields

- Synonyms are now automatically expanded to be bidirectional so users
  only need to define e.g. laptop: ['notebook'] and the reverse mapping
  is generated automatically.
- Add formattedProductName and formattedDescription fields to SearchResult
  to expose Meilisearch highlight/crop data to frontends.
@Ryrahul Ryrahul force-pushed the feat/add-meilisearch-plugin branch from e586862 to d8dfa21 Compare May 11, 2026 20:03
Ryrahul added 2 commits May 12, 2026 01:50
- Document bidirectional synonym expansion behavior
- Add Highlighting & Cropping section with formattedProductName/formattedDescription usage
- Update synonyms JSDoc to mention auto-bidirectional expansion
- Regenerate docs and manifest
@grolmus
Copy link
Copy Markdown

grolmus commented May 21, 2026

@Ryrahul thanks for confirming on the maintenance side — that gives us enough to move forward. We can park the AI/embedder layer for a sibling meilisearch-ai plugin as discussed, and focus this PR on the keyword/full-text core.

I pulled the branch and ran a thorough code review on the non-AI portions. Findings below — please work through the blocking items before we merge, and consider the non-blocking ones. Everything below is against the core (keyword) code; AI paths were intentionally skipped per the scope agreement.

Summary

The core implementation is in solid shape for a first-time community PR of this size, and it leans on @vendure/elasticsearch-plugin as a template, which is the right move. That said, there are several correctness bugs around reindex failure handling, multi-channel deletes, and filter-string building that need to be addressed before merge.

Blocking issues

  1. indexer.controller.ts:299-308 — reindex swap failure silently leaves stale data live.
    The catch logs, deletes the temp index, and then the function continues to Logger.verbose('Completed reindexing!') — so the job resolves as success while the primary index is still stale. Please rethrow after cleanup so the job fails loudly.

  2. indexer.controller.ts:282-298 — first-time reindex has a non-atomic gap between temp populate and swap.
    On first-time indexing, createIndex(primaryIndexUid) is called after the temp index is fully populated, then swapIndexes runs. There's no configureIndex call on the new primary before the swap, and the live index briefly has no documents. Please verify against Meilisearch docs that the swap actually moves settings + filterable/searchable/sortable attributes; if not, apply settings explicitly post-swap.

  3. indexer.controller.ts:551-596deleteProductOperations channel scoping is wrong.
    The product is loaded with andWhere('channel.id = :channelId', { channelId: ctx.channelId }). If the deletion runs from a ctx whose channel isn't one the product belongs to, the query returns undefined and the function exits early — the product is never deleted from the index. Either drop the channel filter on the load (you're already iterating across all channels in the delete loop), or scope the delete loop to ctx.channelId only. The elasticsearch-plugin loads without the channel filter for this case.

  4. indexer.controller.ts:185-192deleteVariants doesn't go through asyncQueue.
    updateVariants uses this.asyncQueue.push(...); deleteVariants does not. That allows a delete to race an in-flight update for the same product. Wrap the body in asyncQueue.push to match.

  5. meilisearch.service.ts:118 (and lines 499, 502, 518, 523) — buildFilter builds Meilisearch filter strings via raw interpolation of user-controlled values.
    collectionSlug, collectionSlugs, and the facet value filter (and/or) all flow into the filter string without escaping " or \. A slug or facet value containing those characters will produce malformed filters or unexpected semantics. Please add an escape helper for filter literals and route all user-supplied values through it.

  6. indexer.controller.ts:743-746 — silent failure in variant indexing.
    Logger.error(err.toString()) is called without loggerCtx and without the stack, and the rethrown Error('Error while reindexing!') drops the original cause. Please use Logger.error(err.message, loggerCtx, err.stack) and either rethrow err or wrap it with cause. Compare to lines 437/458 which already do this correctly.

  7. indexer.controller.ts:522-533 and 607-609 — addDocumentsBatch / deleteDocuments swallow errors during reindex.
    A failed batch during reindex means part of the catalog is missing from the temp index, and the swap then promotes that partial index to live. The job still reports success. Please either throw on failure, or accumulate a failure count and surface it on the job result.

  8. meilisearch.service.ts:446-450generatePriceBuckets ignores per-bucket errors.
    The catch {} block means an outage or partial failure produces a degraded silent response. Either propagate (the calling priceRange() already has a top-level catch) or log with loggerCtx.

  9. indexer.controller.ts:104-106 and meilisearch.service.ts:39 — two separate MeiliSearch client instances in-process.
    The controller and the service each getClient(this.options) in their own onModuleInit. That can be deliberate (each owns a connection pool), but please either share a single provider, or add a comment explaining why two are needed. Also: MeilisearchService doesn't implement OnModuleDestroy. Add it as a no-op with a comment if there's nothing to clean up — at least the lifecycle is then visibly considered.

  10. indexing-utils.ts:53-56getIndexUid silently strips characters.
    indexPrefix: 'shop@a-' and indexPrefix: 'shopa-' both become shopa-variants after the regex. Two Vendure instances sharing a Meilisearch cluster could collide silently. Please either throw on invalid characters in mergeWithDefaults (fail-fast), or document the rule prominently in the README's Multi-Project Setup section.

Non-blocking / improvements

  1. meilisearch.service.ts:119searchParams: any. Use the Meilisearch SDK's SearchParams type. Lines 134-172 manually copy 11 options from sc to searchParams; a typed pick or spread will be safer and more readable.

  2. meilisearch.service.ts:201, 250(result as any).estimatedTotalHits || (result as any).totalHits || 0 is repeated four times. Extract a typed helper getTotalHitCount(result).

  3. meilisearch.service.ts:418-454generatePriceBuckets issues N sequential searches. Wide price ranges with small intervals (e.g. $0–$10,000 at interval: 100) produce 100 sequential round-trips per priceRange call. Consider computing buckets client-side from a single facet histogram (Meilisearch exposes facetStats), or parallelise with a concurrency cap.

  4. plugin.ts:135-138static init silently overwrites MeilisearchPlugin.options on repeat calls. Add a guard or a Logger.warn if already set; helpful for test setups.

  5. plugin.ts:108, 116as any cast on generateSchemaExtensions(MeilisearchPlugin.options as any). The function signature takes MeilisearchOptions but is invoked with MeilisearchRuntimeOptions. Please widen the parameter type at api-extensions.ts:6 to accept either.

  6. indexer.controller.ts:225-319reindex is ~95 lines with three concerns interleaved. Extract createTempIndex / populateTempIndex / swapAndPromote as private methods named for their phase. This file is the one a future maintainer will avoid touching, and structure helps.

  7. indexer.controller.ts:710-713Math.min(...prices). Throws RangeError for very large arrays (>~125k due to spread arg limit), and returns Infinity for an empty array. Switch to reduce.

  8. indexer.controller.ts:849-851getId uses ${channelId}_${entityId}_${languageCode} with no separator escaping. Vendure supports custom ID strategies; a channel id containing _ would collide. The elasticsearch-plugin shares this pattern, so it's consistent — but worth a comment noting the assumption.

  9. meilisearch.health.ts:13-25isHealthy() runs the full checkConnection() retry loop. With defaults that's up to 10 × 5000 ms = 50s per probe. A health check should fail fast — use a single ping with a short timeout.

  10. indexer.controller.ts:388-410updateAssetDocuments increments offset += limit after upserting documents whose IDs match the original filter. This works only because the upserts keep the same matching filter (the asset id is unchanged). Add a comment explaining the invariant; without it this loop reads as suspect.

  11. meilisearch-index.service.ts:166-178asset as any. Cast to JsonCompatible<Required<Asset>> (the actual job data type) instead.

  12. indexing-utils.ts:148-165 — three near-identical blocks for updateSynonyms / updateStopWords / updateRankingRules. Extract applyOptionalSetting(name, value, updaterFn).

  13. options.ts:622-635mergeWithDefaults uses deepmerge with four special-cased fields. Either use deepmerge's arrayMerge option for consistency, or drop deepmerge entirely (the options object is shallow at the top level apart from searchConfig).

  14. No unit tests. The elasticsearch-plugin ships build-elastic-body.spec.ts, variant-price-utils.spec.ts, and indexing-id-helpers.spec.ts. The meilisearch-plugin currently has only e2e tests. buildFilter (with non-trivial AND/OR branching) and expandSynonymsBidirectional (with overlap-group merging) are pure functions and belong in unit tests; e2e against a live Meilisearch can't easily exercise filter-injection edge cases.

  15. README.md:18 — "v1.6+" is unenforced. The SDK constraint is meilisearch: ^0.55.0. Pin server compatibility via a startup version probe (similar to elasticsearch-plugin), or document the rule prominently.

  16. README.md is missing some sections of the shipping surface:

    • No bufferUpdates section despite the option being shipped.
    • No mention that the plugin replaces DefaultSearchPlugin (the plugin docstring says this; the README doesn't).
    • No troubleshooting (e.g. what happens if Meilisearch is unreachable at startup).
  17. meilisearch.service.ts:23 — unused import MeiliSearch.

  18. indexer.controller.ts:128-149assignProductToChannel / removeProductFromChannel ignore the channelId parameter and re-index against the ctx channel. Either use the channelId to scope the reindex, or add a comment explaining why ignoring it is correct (presumably because the ProductChannelEvent has already mutated the DB and updateProductsInternal re-reads the channel binding).

  19. plugin.ts:160-181 — event subscribers don't .catch their handler promises. RxJS will surface that as an unhandled error on the subscription chain. Either add .catch to each handler, or pipe through tap + catchError.

  20. plugin.ts:146-148 — startup-error path logs JSON.stringify(e). That produces {} for Error instances (because message and stack are non-enumerable). Use e.message/e.stack like the rest of the codebase.

Vendure-convention deviations vs @vendure/elasticsearch-plugin

  • No OnModuleDestroy on MeilisearchService. Implement as a no-op with a comment.
  • No extraction of buildFilter/buildSort to a testable pure module. The ES plugin pulls these into build-elastic-body.ts specifically because they're the most mistake-prone code.
  • Multi-currency awareness. ES plugin handles per-channel currency carefully. The meilisearch indexer at indexer.controller.ts:792 uses ctx.currencyCode without iterating Channel.availableCurrencyCodes. Please confirm this is correct for multi-currency channels, or document the limitation.
  • pendingSearchIndexUpdates SDL. The resolver at meilisearch-resolver.ts:72-76 returns the value, but I don't see the query declared in api-extensions.ts. Please verify it's reachable (it may be registered by SearchJobBufferService independently; if so, ignore).

Skipped

AI/embedder code (hybrid retry, similarDocuments, embedder configs, AI sections of options.ts) was skipped — that'll be reviewed when it lands in the sibling plugin.


Apologies for the volume — the size reflects the size of the PR, not concerns about the work. Please address the blocking items and pick whichever non-blocking ones you have time for; raise any you'd like to defer and we can decide per-item.

@Ryrahul
Copy link
Copy Markdown
Author

Ryrahul commented May 21, 2026

@Ryrahul thanks for confirming on the maintenance side — that gives us enough to move forward. We can park the AI/embedder layer for a sibling meilisearch-ai plugin as discussed, and focus this PR on the keyword/full-text core.

I pulled the branch and ran a thorough code review on the non-AI portions. Findings below — please work through the blocking items before we merge, and consider the non-blocking ones. Everything below is against the core (keyword) code; AI paths were intentionally skipped per the scope agreement.

Summary

The core implementation is in solid shape for a first-time community PR of this size, and it leans on @vendure/elasticsearch-plugin as a template, which is the right move. That said, there are several correctness bugs around reindex failure handling, multi-channel deletes, and filter-string building that need to be addressed before merge.

Blocking issues

  1. indexer.controller.ts:299-308 — reindex swap failure silently leaves stale data live.
    The catch logs, deletes the temp index, and then the function continues to Logger.verbose('Completed reindexing!') — so the job resolves as success while the primary index is still stale. Please rethrow after cleanup so the job fails loudly.
  2. indexer.controller.ts:282-298 — first-time reindex has a non-atomic gap between temp populate and swap.
    On first-time indexing, createIndex(primaryIndexUid) is called after the temp index is fully populated, then swapIndexes runs. There's no configureIndex call on the new primary before the swap, and the live index briefly has no documents. Please verify against Meilisearch docs that the swap actually moves settings + filterable/searchable/sortable attributes; if not, apply settings explicitly post-swap.
  3. indexer.controller.ts:551-596deleteProductOperations channel scoping is wrong.
    The product is loaded with andWhere('channel.id = :channelId', { channelId: ctx.channelId }). If the deletion runs from a ctx whose channel isn't one the product belongs to, the query returns undefined and the function exits early — the product is never deleted from the index. Either drop the channel filter on the load (you're already iterating across all channels in the delete loop), or scope the delete loop to ctx.channelId only. The elasticsearch-plugin loads without the channel filter for this case.
  4. indexer.controller.ts:185-192deleteVariants doesn't go through asyncQueue.
    updateVariants uses this.asyncQueue.push(...); deleteVariants does not. That allows a delete to race an in-flight update for the same product. Wrap the body in asyncQueue.push to match.
  5. meilisearch.service.ts:118 (and lines 499, 502, 518, 523) — buildFilter builds Meilisearch filter strings via raw interpolation of user-controlled values.
    collectionSlug, collectionSlugs, and the facet value filter (and/or) all flow into the filter string without escaping " or \. A slug or facet value containing those characters will produce malformed filters or unexpected semantics. Please add an escape helper for filter literals and route all user-supplied values through it.
  6. indexer.controller.ts:743-746 — silent failure in variant indexing.
    Logger.error(err.toString()) is called without loggerCtx and without the stack, and the rethrown Error('Error while reindexing!') drops the original cause. Please use Logger.error(err.message, loggerCtx, err.stack) and either rethrow err or wrap it with cause. Compare to lines 437/458 which already do this correctly.
  7. indexer.controller.ts:522-533 and 607-609 — addDocumentsBatch / deleteDocuments swallow errors during reindex.
    A failed batch during reindex means part of the catalog is missing from the temp index, and the swap then promotes that partial index to live. The job still reports success. Please either throw on failure, or accumulate a failure count and surface it on the job result.
  8. meilisearch.service.ts:446-450generatePriceBuckets ignores per-bucket errors.
    The catch {} block means an outage or partial failure produces a degraded silent response. Either propagate (the calling priceRange() already has a top-level catch) or log with loggerCtx.
  9. indexer.controller.ts:104-106 and meilisearch.service.ts:39 — two separate MeiliSearch client instances in-process.
    The controller and the service each getClient(this.options) in their own onModuleInit. That can be deliberate (each owns a connection pool), but please either share a single provider, or add a comment explaining why two are needed. Also: MeilisearchService doesn't implement OnModuleDestroy. Add it as a no-op with a comment if there's nothing to clean up — at least the lifecycle is then visibly considered.
  10. indexing-utils.ts:53-56getIndexUid silently strips characters.
    indexPrefix: 'shop@a-' and indexPrefix: 'shopa-' both become shopa-variants after the regex. Two Vendure instances sharing a Meilisearch cluster could collide silently. Please either throw on invalid characters in mergeWithDefaults (fail-fast), or document the rule prominently in the README's Multi-Project Setup section.

Non-blocking / improvements

  1. meilisearch.service.ts:119searchParams: any. Use the Meilisearch SDK's SearchParams type. Lines 134-172 manually copy 11 options from sc to searchParams; a typed pick or spread will be safer and more readable.

  2. meilisearch.service.ts:201, 250(result as any).estimatedTotalHits || (result as any).totalHits || 0 is repeated four times. Extract a typed helper getTotalHitCount(result).

  3. meilisearch.service.ts:418-454generatePriceBuckets issues N sequential searches. Wide price ranges with small intervals (e.g. $0–$10,000 at interval: 100) produce 100 sequential round-trips per priceRange call. Consider computing buckets client-side from a single facet histogram (Meilisearch exposes facetStats), or parallelise with a concurrency cap.

  4. plugin.ts:135-138static init silently overwrites MeilisearchPlugin.options on repeat calls. Add a guard or a Logger.warn if already set; helpful for test setups.

  5. plugin.ts:108, 116as any cast on generateSchemaExtensions(MeilisearchPlugin.options as any). The function signature takes MeilisearchOptions but is invoked with MeilisearchRuntimeOptions. Please widen the parameter type at api-extensions.ts:6 to accept either.

  6. indexer.controller.ts:225-319reindex is ~95 lines with three concerns interleaved. Extract createTempIndex / populateTempIndex / swapAndPromote as private methods named for their phase. This file is the one a future maintainer will avoid touching, and structure helps.

  7. indexer.controller.ts:710-713Math.min(...prices). Throws RangeError for very large arrays (>~125k due to spread arg limit), and returns Infinity for an empty array. Switch to reduce.

  8. indexer.controller.ts:849-851getId uses ${channelId}_${entityId}_${languageCode} with no separator escaping. Vendure supports custom ID strategies; a channel id containing _ would collide. The elasticsearch-plugin shares this pattern, so it's consistent — but worth a comment noting the assumption.

  9. meilisearch.health.ts:13-25isHealthy() runs the full checkConnection() retry loop. With defaults that's up to 10 × 5000 ms = 50s per probe. A health check should fail fast — use a single ping with a short timeout.

  10. indexer.controller.ts:388-410updateAssetDocuments increments offset += limit after upserting documents whose IDs match the original filter. This works only because the upserts keep the same matching filter (the asset id is unchanged). Add a comment explaining the invariant; without it this loop reads as suspect.

  11. meilisearch-index.service.ts:166-178asset as any. Cast to JsonCompatible<Required<Asset>> (the actual job data type) instead.

  12. indexing-utils.ts:148-165 — three near-identical blocks for updateSynonyms / updateStopWords / updateRankingRules. Extract applyOptionalSetting(name, value, updaterFn).

  13. options.ts:622-635mergeWithDefaults uses deepmerge with four special-cased fields. Either use deepmerge's arrayMerge option for consistency, or drop deepmerge entirely (the options object is shallow at the top level apart from searchConfig).

  14. No unit tests. The elasticsearch-plugin ships build-elastic-body.spec.ts, variant-price-utils.spec.ts, and indexing-id-helpers.spec.ts. The meilisearch-plugin currently has only e2e tests. buildFilter (with non-trivial AND/OR branching) and expandSynonymsBidirectional (with overlap-group merging) are pure functions and belong in unit tests; e2e against a live Meilisearch can't easily exercise filter-injection edge cases.

  15. README.md:18 — "v1.6+" is unenforced. The SDK constraint is meilisearch: ^0.55.0. Pin server compatibility via a startup version probe (similar to elasticsearch-plugin), or document the rule prominently.

  16. README.md is missing some sections of the shipping surface:

    • No bufferUpdates section despite the option being shipped.
    • No mention that the plugin replaces DefaultSearchPlugin (the plugin docstring says this; the README doesn't).
    • No troubleshooting (e.g. what happens if Meilisearch is unreachable at startup).
  17. meilisearch.service.ts:23 — unused import MeiliSearch.

  18. indexer.controller.ts:128-149assignProductToChannel / removeProductFromChannel ignore the channelId parameter and re-index against the ctx channel. Either use the channelId to scope the reindex, or add a comment explaining why ignoring it is correct (presumably because the ProductChannelEvent has already mutated the DB and updateProductsInternal re-reads the channel binding).

  19. plugin.ts:160-181 — event subscribers don't .catch their handler promises. RxJS will surface that as an unhandled error on the subscription chain. Either add .catch to each handler, or pipe through tap + catchError.

  20. plugin.ts:146-148 — startup-error path logs JSON.stringify(e). That produces {} for Error instances (because message and stack are non-enumerable). Use e.message/e.stack like the rest of the codebase.

Vendure-convention deviations vs @vendure/elasticsearch-plugin

  • No OnModuleDestroy on MeilisearchService. Implement as a no-op with a comment.
  • No extraction of buildFilter/buildSort to a testable pure module. The ES plugin pulls these into build-elastic-body.ts specifically because they're the most mistake-prone code.
  • Multi-currency awareness. ES plugin handles per-channel currency carefully. The meilisearch indexer at indexer.controller.ts:792 uses ctx.currencyCode without iterating Channel.availableCurrencyCodes. Please confirm this is correct for multi-currency channels, or document the limitation.
  • pendingSearchIndexUpdates SDL. The resolver at meilisearch-resolver.ts:72-76 returns the value, but I don't see the query declared in api-extensions.ts. Please verify it's reachable (it may be registered by SearchJobBufferService independently; if so, ignore).

Skipped

AI/embedder code (hybrid retry, similarDocuments, embedder configs, AI sections of options.ts) was skipped — that'll be reviewed when it lands in the sibling plugin.

Apologies for the volume — the size reflects the size of the PR, not concerns about the work. Please address the blocking items and pick whichever non-blocking ones you have time for; raise any you'd like to defer and we can decide per-item.

Thanks @grolmus for the very detailed review. Sure, i will get the blockers sorted out first . then move on to non blockers

Ryrahul added 8 commits May 23, 2026 22:49
- Fix JSON.stringify(e) in startup error logging
- Warn on repeat MeilisearchPlugin.init() calls
- Add .catch to all event subscriber handlers
- Extract getTotalHitCount helper to reduce duplication
- Widen generateSchemaExtensions param type, remove as any casts
- Fix asset as any cast to JsonCompatible<Required<Asset>>
- Replace Math.min/max spread with reduce for large array safety
- Add single-ping health check instead of retry loop
- Add comments on getId separator, updateAssetDocuments offset,
  and channel handler channelId assumptions
…oubleshooting

Also adds unit tests for buildFilter, buildSort, escapeFilterValue,
and expandSynonymsBidirectional (39 tests covering filter construction,
injection escaping, synonym expansion, and edge cases).
@Ryrahul
Copy link
Copy Markdown
Author

Ryrahul commented May 23, 2026

Hey @grolmus — thanks for the thorough review. I’ve addressed all 10 blocking issues along with the convention deviations. For the non-blocking improvements, I’ve addressed most of them as well.

Deferred items (happy to address before merge if preferred):

  • Type searchParams with the SDK’s SearchParams type — requires SDK type compatibility testing ( will need some time)
  • Extract reindex into sub-methods can do this, just didn't want to mix structural refactors with the bug fixes in the same round of changes. Happy to take it on as a follow-up
  • Startup version probe for Meilisearch v1.6+ , will add in a follow-up PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants