feat: local embedding provider for search #36
|
@1st1 I think support for custom endpoints would be easy enough to add, and it would be a nice integration for people who are already running local servers (which I assume LM Studio does). We could prioritize it like:
I could see to adding |
|
@jeffmm looks awesome! Thank you! For future PRs, I'd consider exploring better-optimized local inference ('transformers.feature_extraction' on CPU might be really slow even for small models) and switching to late interaction instead of pooling, since we already have multi-vector local models. |
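For readers less familiar with the pooling vs. late interaction distinction, here is a minimal sketch, not code from this PR. It assumes token embeddings are plain number arrays; `meanPool` and `maxSim` are hypothetical helper names, and `maxSim` follows the ColBERT-style "sum of per-query-token maxima" scoring.

```typescript
type Vec = number[];

// Plain dot product between two vectors of equal length.
const dot = (a: Vec, b: Vec): number => a.reduce((s, x, i) => s + x * b[i], 0);

// Pooling: collapse all token vectors into a single averaged vector.
function meanPool(tokens: Vec[]): Vec {
  const dim = tokens[0].length;
  const out = new Array<number>(dim).fill(0);
  for (const t of tokens) {
    for (let i = 0; i < dim; i++) out[i] += t[i] / tokens.length;
  }
  return out;
}

// Late interaction (hypothetical MaxSim helper): keep every token vector,
// and for each query token add the similarity of its best-matching doc token.
function maxSim(query: Vec[], doc: Vec[]): number {
  return query.reduce(
    (sum, q) => sum + Math.max(...doc.map((d) => dot(q, d))),
    0,
  );
}

// Toy data: two orthogonal "token" vectors on each side.
const queryTokens: Vec[] = [[1, 0], [0, 1]];
const docTokens: Vec[] = [[1, 0], [0, 1]];

const pooledScore = dot(meanPool(queryTokens), meanPool(docTokens));
const lateScore = maxSim(queryTokens, docTokens);
```

With pooling, each text is reduced to one vector before comparison; with late interaction, the per-token vectors are kept and matched individually, which is why it needs a multi-vector model.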
|
Adding the embedding models via Hugging Face has a great out-of-the-box user experience. |
|
@lars20070 I see the value in the provider-first approach, and I think it fits the project better today, but I disagree that it's future-proof. Most recent advancements in semantic search are focused on reducing model sizes, combining dense embeddings with static sparse embeddings, and merging vectors to make semantic search work locally in most deployments without any HTTP requests. That's why I think lat.md should follow a local-first approach. |
|
I think adding a general API base is a good feature for lat.md, but I would not describe it as an alternative approach that is mutually exclusive with this PR. They are complementary. I managed to convince my team to use lat.md in production, but one of the biggest barriers to getting other teams to fully embrace it is that it requires extra setup to work well. Right now, the experience is degraded for many of our engineers whose agents are simply relying on |
|
Thanks for the helpful replies — you're right that this isn't #36 vs #52, and either or both approaches can be adopted. @jeffmm I've also come across several teams running It would be great to see #36 land soon. Happy to help review, test, or pick up follow-ups — just let me know what would be most useful. |
|
@1st1 I can attest to what @jeffmm is saying. My team started using lat.md and it's been an amazing tool, but we don't use Vercel or OpenAI, so we have to resort to using `lat locate`, and the agents keep trying `lat search` and getting errors. Sometimes agents will try `lat search`, get an error, and just skip lat altogether and get to work without it.
This has been honestly the only and biggest barrier to using lat for us. I've cloned and built this PR and tried a few lat search commands on my existing project, and it works very well out of the box. API keys should be an optional feature for users who need higher quality embeddings, but a built-in small model that this PR provides is an incredible accessibility improvement. Please please get this merged if possible <3 @jeffmm What do you think about publishing your fork to npm until this is merged? |
|
@lars20070 Thanks. It would be great if you can give the PR a review. I've been testing locally for several weeks now without issues, but I'd encourage you to give it a try locally as well. Personally I think these changes give lat.md a smooth "it just works" experience, but I'm happy to iterate on it based on your recommendations. |
Adds a local embedding provider so `lat search` works out of the box without an API key.

- `@huggingface/transformers` as optional dep: in-process inference, no external server
- `LAT_LOCAL_MODEL_SIZE` (small/medium/large, default `small` at 384 dims)
- `kind: 'api' | 'local'` on `EmbeddingProvider` for clean dispatch between API and local paths
- `ensureSchema` tracks dimensions in a `meta` table and rebuilds the index on mismatch (e.g. switching providers)

Full disclosure: Claude Code helped me write this PR with the help of lat.md's lat.md 😄
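As a rough illustration of the `kind` discriminator and the model-size setting described above, here is a minimal sketch. It is not the PR's actual code: only `EmbeddingProvider`, `kind: 'api' | 'local'`, the three size names, and the `small` default come from the description. The `endpoint` and `modelSize` fields and the `parseModelSize` and `describeProvider` helpers are assumptions made for illustration.

```typescript
type ModelSize = "small" | "medium" | "large";

interface ApiProvider {
  kind: "api";
  dimensions: number;
  endpoint: string; // assumed field, not taken from the PR
}

interface LocalProvider {
  kind: "local";
  dimensions: number;
  modelSize: ModelSize; // assumed field, not taken from the PR
}

// Discriminated union: `kind` tells the compiler which shape is in play.
type EmbeddingProvider = ApiProvider | LocalProvider;

// Hypothetical helper: interpret a raw LAT_LOCAL_MODEL_SIZE value,
// falling back to "small" when it is unset or unrecognized.
function parseModelSize(raw: string | undefined): ModelSize {
  const value = (raw ?? "").trim().toLowerCase();
  return value === "medium" || value === "large" ? value : "small";
}

// Hypothetical helper: dispatch on the discriminator. The `never` assignment
// makes the compiler complain if a new kind is added but not handled here.
function describeProvider(p: EmbeddingProvider): string {
  switch (p.kind) {
    case "api":
      return `api:${p.endpoint}`;
    case "local":
      return `local:${p.modelSize}`;
    default: {
      const unreachable: never = p;
      throw new Error(`unknown provider: ${JSON.stringify(unreachable)}`);
    }
  }
}

const localDefault: EmbeddingProvider = {
  kind: "local",
  dimensions: 384, // the description states small = 384 dims
  modelSize: parseModelSize(undefined),
};
```

Inside each `case`, TypeScript narrows `p` to the matching interface, so the API path cannot accidentally read local-only fields and vice versa.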
Closes #8
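The dimension check that `ensureSchema` is described as performing can be pictured roughly as follows. This is a sketch under stated assumptions, not the PR's implementation: the storage layer is replaced by a hypothetical in-memory `IndexStore`, and the signature and return value of `ensureSchema` here are invented for illustration.

```typescript
// Hypothetical in-memory stand-in for the index database.
interface IndexStore {
  meta: Map<string, string>; // plays the role of the `meta` table
  vectors: Map<string, number[]>; // section id -> embedding
}

// Returns true when the index was reset and everything must be re-embedded.
function ensureSchema(store: IndexStore, dimensions: number): boolean {
  const stored = store.meta.get("dimensions");
  if (stored === String(dimensions)) {
    return false; // same width as before, existing vectors stay valid
  }
  // First run or provider switch: old vectors have the wrong width.
  store.vectors.clear();
  store.meta.set("dimensions", String(dimensions));
  return true;
}

const store: IndexStore = { meta: new Map(), vectors: new Map() };

const firstRun = ensureSchema(store, 1536); // empty store, so it resets
store.vectors.set("intro", new Array<number>(1536).fill(0));
const sameProvider = ensureSchema(store, 1536); // no change, vectors kept
const keptCount = store.vectors.size;
const switched = ensureSchema(store, 384); // e.g. moving to the small local model
```

The 1536 figure is only an example width for an API model; the point is that a mismatch between the stored and requested dimensions triggers a rebuild rather than mixing vectors of different sizes.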