feat: local embedding provider for search by jeffmm · Pull Request #36 · 1st1/lat.md

jeffmm · 2026-03-22T19:04:26Z

Adds a local embedding provider so lat search works out of the box without an API key.

@huggingface/transformers as optional dep — in-process inference, no external server
Three pre-selected model sizes via LAT_LOCAL_MODEL_SIZE (small/medium/large, default small at 384 dims)
Discriminated union (kind: 'api' | 'local') on EmbeddingProvider for clean dispatch between API and local paths
ensureSchema tracks dimensions in a meta table and rebuilds the index on mismatch (e.g. switching providers)
Pipeline promise cached so concurrent callers share a single model load
Tests for provider detection, dimension mismatch, local embedding quality, and MCP no-key behavior
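
To make the dispatch, caching, and dimension-tracking bullets above concrete, here is a minimal TypeScript sketch. It is not the PR's code: every name in it (`ModelSize`, `loadLocalModel`, `getLocalPipeline`, `embed`, `needsRebuild`, `loadLog`) is hypothetical, the model load is a stand-in that does no real inference, and the API path is left out. It only illustrates a `kind`-tagged union, caching a promise so concurrent callers share one load, and a stored-versus-current dimension comparison.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
type ModelSize = "small" | "medium" | "large";

type EmbeddingProvider =
  | { kind: "api"; endpoint: string; apiKey: string }
  | { kind: "local"; size: ModelSize };

type Embedder = (texts: string[]) => Promise<number[][]>;

// Records every (fake) model load so the sharing behaviour can be observed.
const loadLog: ModelSize[] = [];

// Stand-in for an expensive in-process model load. No real inference happens.
async function loadLocalModel(size: ModelSize): Promise<Embedder> {
  loadLog.push(size);
  await Promise.resolve(); // yield once, as a real async load would
  // The PR text gives 384 dims for "small"; other sizes are not specified
  // there, so this sketch reuses 384 as a placeholder for all of them.
  const dims = 384;
  return async (texts) => texts.map(() => new Array<number>(dims).fill(0));
}

// Cache the promise rather than the resolved model: a second caller that
// arrives while the first load is still in flight reuses the same promise.
const pipelineCache = new Map<ModelSize, Promise<Embedder>>();

function getLocalPipeline(size: ModelSize): Promise<Embedder> {
  let pending = pipelineCache.get(size);
  if (!pending) {
    pending = loadLocalModel(size).catch((err) => {
      pipelineCache.delete(size); // allow a retry after a failed load
      throw err;
    });
    pipelineCache.set(size, pending);
  }
  return pending;
}

// The `kind` tag lets TypeScript narrow the union inside each case.
async function embed(
  provider: EmbeddingProvider,
  texts: string[],
): Promise<number[][]> {
  switch (provider.kind) {
    case "api":
      throw new Error(`API path (${provider.endpoint}) is omitted from this sketch`);
    case "local": {
      const run = await getLocalPipeline(provider.size);
      return run(texts);
    }
  }
}

// Stand-in for the meta-table check: rebuild only when a previously stored
// dimension count exists and differs from the current provider's.
function needsRebuild(storedDims: number | null, currentDims: number): boolean {
  return storedDims !== null && storedDims !== currentDims;
}
```

Caching the promise instead of the loaded model is what closes the race: if only the resolved value were cached, two callers arriving before the first load finished would each start their own load.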

Full disclosure: Claude Code helped me write this PR with the help of lat.md's lat.md 😄

Closes #8

vercel · 2026-03-22T19:04:29Z

@jeffmm is attempting to deploy a commit to the Yury Selivanov's projects Team on Vercel.

A member of the Team first needs to authorize it.

1st1 · 2026-03-24T03:52:42Z

@jeffmm huge thanks! Let me ponder on this. Quick question: what if I just want to use LM Studio's embedding model? Should we design for that too?

@anbuzin Andrey, what do you think?

jeffmm · 2026-03-24T17:52:16Z

@1st1 I think support for custom endpoints would be easy enough to add, and it would be a nice integration for people who are already running local servers (which I assume LM Studio does). We could prioritize it like:

LAT_LLM_ENDPOINT for users who run local servers or otherwise want to use other OpenAI-compatible endpoints
LAT_LLM_API_KEY for users who have OpenAI or Vercel API keys
LAT_LOCAL_MODEL_SIZE as a fallback.

I could see adding LAT_LLM_ENDPOINT as a fast follow-up PR.
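
The priority order proposed above could be sketched roughly as follows. This is an illustration, not code from the PR: `ResolvedProvider` and `resolveProvider` are hypothetical names, `LAT_LLM_ENDPOINT` is only a proposal at this point in the thread, and the silent fallback to `small` for unrecognized sizes is an assumption.

```typescript
// Hypothetical result type for illustration only.
type ResolvedProvider =
  | { kind: "endpoint"; baseUrl: string }
  | { kind: "api"; apiKey: string }
  | { kind: "local"; size: "small" | "medium" | "large" };

type Env = Record<string, string | undefined>;

// Resolve a provider in the order suggested in the comment:
// custom endpoint first, then API key, then the local model as a fallback.
function resolveProvider(env: Env): ResolvedProvider {
  const endpoint = env["LAT_LLM_ENDPOINT"]?.trim();
  if (endpoint) {
    // Whether an API key is also forwarded to a custom endpoint is left
    // open in the thread, so this sketch does not model it.
    return { kind: "endpoint", baseUrl: endpoint };
  }

  const apiKey = env["LAT_LLM_API_KEY"]?.trim();
  if (apiKey) {
    return { kind: "api", apiKey };
  }

  const size = env["LAT_LOCAL_MODEL_SIZE"]?.trim().toLowerCase();
  if (size === "medium" || size === "large") {
    return { kind: "local", size };
  }
  // Assumption: unset or unrecognized values fall back to the default, small.
  return { kind: "local", size: "small" };
}
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the resolution order easy to test, for example `resolveProvider({ LAT_LOCAL_MODEL_SIZE: "large" })`.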

nickshirobokov · 2026-03-25T23:31:56Z

@jeffmm looks awesome! Thank you!

For future PRs, I'd consider exploring better-optimized local inference ('transformers.feature_extraction' on CPU might be really slow even for small models) and switching to late interaction instead of pooling, since we already have multi-vector local models.
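
For readers unfamiliar with the distinction, the sketch below contrasts the two scoring styles on plain arrays. It is a toy illustration, not something from the PR: all function names are hypothetical, the per-token vectors are assumed to come from some multi-vector model, and the late-interaction score shown is the ColBERT-style MaxSim sum, which is one common formulation.

```typescript
type Vec = number[];

function dot(a: Vec, b: Vec): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Scale to unit length; a zero vector is returned unchanged.
function normalize(v: Vec): Vec {
  const norm = Math.sqrt(dot(v, v));
  return norm === 0 ? [...v] : v.map((x) => x / norm);
}

// Pooling: collapse all token vectors into one vector by averaging.
function meanPool(tokens: Vec[]): Vec {
  const first = tokens[0];
  if (!first) throw new Error("meanPool needs at least one token vector");
  return first.map(
    (_, i) => tokens.reduce((sum, t) => sum + (t[i] ?? 0), 0) / tokens.length,
  );
}

// Single-vector score: cosine similarity of the two pooled vectors.
function pooledScore(queryTokens: Vec[], docTokens: Vec[]): number {
  return dot(normalize(meanPool(queryTokens)), normalize(meanPool(docTokens)));
}

// Late interaction (MaxSim): keep every token vector, and for each query
// token take its best cosine match among the document tokens, then sum.
function maxSimScore(queryTokens: Vec[], docTokens: Vec[]): number {
  const docs = docTokens.map((d) => normalize(d));
  if (docs.length === 0) return 0;
  return queryTokens
    .map((q) => normalize(q))
    .reduce((total, q) => {
      const best = docs.reduce((m, d) => Math.max(m, dot(q, d)), -Infinity);
      return total + best;
    }, 0);
}
```

The trade-off is storage and compute: pooling stores one vector per chunk, while late interaction stores one vector per token and does a query-tokens by document-tokens comparison at search time.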

lars20070 · 2026-05-04T11:39:00Z

Adding the embedding models via Hugging Face has a great out-of-the-box user experience.
But I would suggest reviewing the alternative approach in #52 as well. IMO, only one of the two PRs needs to be merged.

nickshirobokov · 2026-05-08T20:15:34Z

@lars20070 I see the value in the provider-first approach, and I think it fits the project better today, but I disagree that it's future-proof. Most recent advancements in semantic search are focused on reducing model sizes, combining dense embeddings with static sparse embeddings, and merging vectors to make semantic search work locally in most deployments without any HTTP requests. That's why I think lat.md should follow a local-first approach.

jeffmm · 2026-05-09T00:03:33Z

I think adding a general API base is a good feature for lat.md, but I would not describe it as an alternative approach that is mutually exclusive with this PR. They are complementary.

I managed to convince my team to use lat.md in production, but one of the biggest barriers to getting other teams to fully embrace it is that it requires extra setup to work well. Right now, the experience is degraded for many of our engineers whose agents are simply relying on lat locate and grep, and I've had to disable hooks and steer the agent away from lat search in the AGENTS.md to avoid complaints of poor agent behavior. An out-of-the-box experience is crucial for maintaining a low barrier to entry, and would increase the likelihood of adoption.

lars20070 · 2026-05-11T07:49:22Z

Thanks for the helpful replies — you're right that this isn't #36 vs #52, and either or both approaches can be adopted.

@jeffmm I've also come across several teams running lat.md in production, and the API-key setup keeps surfacing as a friction point. The longer that gap remains, the more likely teams are to build their own workarounds, which risks fragmenting the codebase.

It would be great to see #36 land soon. Happy to help review, test, or pick up follow-ups — just let me know what would be most useful.

grumd · 2026-05-13T12:13:12Z

@1st1 I can attest to what @jeffmm is saying. My team started using lat.md and it's been an amazing tool, but we don't use Vercel or OpenAI, so we have to resort to using lat locate and the agents keep trying lat search and getting errors. Sometimes agents will try lat search, get an error, and just skip lat altogether and get to work without it.

lat locate is rough to use because substring matching doesn't work. I can't match an existing section "Delete Transaction" by running lat locate 'Delete'. lat search is crucial for lat to work like it's supposed to.

This has honestly been the biggest, and really the only, barrier to using lat for us. I've cloned and built this PR and tried a few lat search commands on my existing project, and it works very well out of the box. API keys should be an optional feature for users who need higher quality embeddings, but the built-in small model that this PR provides is an incredible accessibility improvement. Please please get this merged if possible <3

@jeffmm What do you think about publishing your fork to npm until this is merged?

jeffmm · 2026-05-19T19:20:45Z

@lars20070 Thanks. It would be great if you can give the PR a review. I've been testing locally for several weeks now without issues, but I'd encourage you to give it a try locally as well. Personally I think these changes give lat.md a smooth "it just works" experience, but I'm happy to iterate on it based on your recommendations.

jeffmm added 10 commits March 23, 2026 09:23

feat: add local embedding option for search (9d4c0c2)
feat: add env var for custom local model (2ee4dcb)
feat: add better dimensionality detection (7aa292b)
refactor: new local module for search, fix circular dependencies (2d7cd49)
refactor: symmetric dimensions resolution between api and local provider (d3e4cbb)
refactor: prefer pre-determined local models rather than allow custom (cb507a5)
fix: remove unneeded approxMb cruft (9aa8fd0)
fix: unsafe cast (b7f0cbb)
fix: formatting (7397b8a)
fix: explicit embedding type in tests (6fe644b)

jeffmm force-pushed the jmm/local-embedding-provider branch from 5e362e0 to 6fe644b on March 23, 2026 15:26

This was referenced Mar 25, 2026

feat: Gemini embedding provider + LAT_LLM_ENDPOINT for custom servers #40

Open

feat: Gemini embedding provider + LAT_LLM_ENDPOINT for custom servers #41

Open

dundalek mentioned this pull request Mar 31, 2026

Add support for custom providers and local models #52

Open

jeffmm wants to merge 10 commits into 1st1:main from jeffmm:jmm/local-embedding-provider
