You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Semantic search currently only supports OpenAI (sk-...) and Vercel (vck_...) embedding providers. Users with Google AI keys or local OpenAI-compatible servers (LM Studio, Ollama) can't use lat search.
Proposal
Two additions to provider.ts:
1. Gemini auto-detection (AIza prefix)
Google provides an OpenAI-compatible embedding endpoint at generativelanguage.googleapis.com/v1beta/openai/embeddings. The request/response format is identical to OpenAI — same {model, input, dimensions} body, same {data: [{embedding, index}]} response, same Authorization: Bearer auth.
Benefits:
Free tier: 1,500 requests/minute at no cost — lowers the barrier for new lat.md users
Zero new dependencies: Just a new provider config + prefix detection
Gemini's gemini-embedding-001 supports dimension truncation via the dimensions parameter, so we request 1536 to match the existing DB schema
2. LAT_LLM_ENDPOINT for any OpenAI-compatible server
LAT_LLM_ENDPOINT=http://localhost:11434/v1 # Ollama, LM Studio, vLLM, etc.
LAT_LLM_MODEL=nomic-embed-text # optional, defaults to text-embedding-3-small
LAT_LLM_KEY=... # still required for auth
When LAT_LLM_ENDPOINT is set, it takes highest priority over key-prefix detection. This complements #36's local provider as the "custom API" tier in the priority chain:
LAT_LLM_ENDPOINT → custom OpenAI-compatible server
Small fix: the embed() function now includes dimensions in the request body. This is needed for Gemini (defaults to 3072, needs truncation to 1536) but is also correct for OpenAI which supports the same parameter.
Problem
Semantic search currently only supports OpenAI (
sk-...) and Vercel (vck_...) embedding providers. Users with Google AI keys or local OpenAI-compatible servers (LM Studio, Ollama) can't uselat search.Proposal
Two additions to
provider.ts:1. Gemini auto-detection (
AIzaprefix)Google provides an OpenAI-compatible embedding endpoint at
generativelanguage.googleapis.com/v1beta/openai/embeddings. The request/response format is identical to OpenAI — same{model, input, dimensions}body, same{data: [{embedding, index}]}response, sameAuthorization: Bearerauth.Benefits:
gemini-embedding-001supports dimension truncation via thedimensionsparameter, so we request 1536 to match the existing DB schema2.
LAT_LLM_ENDPOINTfor any OpenAI-compatible serverFollowing the design @jeffmm proposed in #36:
When
LAT_LLM_ENDPOINTis set, it takes highest priority over key-prefix detection. This complements #36's local provider as the "custom API" tier in the priority chain:LAT_LLM_ENDPOINT→ custom OpenAI-compatible serverLAT_LLM_KEYprefix detection → OpenAI, Vercel, Gemini3. Pass
dimensionsin embedding requestSmall fix: the
embed()function now includesdimensionsin the request body. This is needed for Gemini (defaults to 3072, needs truncation to 1536) but is also correct for OpenAI which supports the same parameter.PR
#41