feat: Gemini embedding provider + LAT_LLM_ENDPOINT for custom servers #40

Description

@kgray-wasteology

Problem

Semantic search currently only supports OpenAI (sk-...) and Vercel (vck_...) embedding providers. Users with Google AI keys or local OpenAI-compatible servers (LM Studio, Ollama) can't use lat search.

Proposal

Three changes to provider.ts:

1. Gemini auto-detection (AIza prefix)

Google provides an OpenAI-compatible embedding endpoint at generativelanguage.googleapis.com/v1beta/openai/embeddings. The request/response format is identical to OpenAI — same {model, input, dimensions} body, same {data: [{embedding, index}]} response, same Authorization: Bearer auth.

Benefits:

  • Free tier: 1,500 requests/minute at no cost — lowers the barrier for new lat.md users
  • Zero new dependencies: Just a new provider config + prefix detection
  • Gemini's gemini-embedding-001 supports dimension truncation via the dimensions parameter, so we request 1536 to match the existing DB schema
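A minimal sketch of how the prefix detection could look. The `ProviderConfig` shape, the `PREFIX_PROVIDERS` table, and the `detectProviderByKey` name are hypothetical and may not match what provider.ts actually uses. The Vercel entry is left out because its endpoint is not stated in this issue.

```typescript
// Hypothetical config shape; the real provider.ts may differ.
interface ProviderConfig {
  name: string;
  endpoint: string;
  model: string;
  dimensions: number;
}

// Key prefix -> provider config. Only the Gemini entry is new in this proposal.
const PREFIX_PROVIDERS: Array<{ prefix: string; config: ProviderConfig }> = [
  {
    prefix: "sk-",
    config: {
      name: "openai",
      endpoint: "https://api.openai.com/v1/embeddings",
      model: "text-embedding-3-small",
      dimensions: 1536,
    },
  },
  {
    prefix: "AIza",
    config: {
      name: "gemini",
      endpoint:
        "https://generativelanguage.googleapis.com/v1beta/openai/embeddings",
      model: "gemini-embedding-001",
      // Request 1536 so vectors match the existing DB schema.
      dimensions: 1536,
    },
  },
];

// Returns the first provider whose prefix matches the key, or undefined.
function detectProviderByKey(key: string): ProviderConfig | undefined {
  return PREFIX_PROVIDERS.find((p) => key.startsWith(p.prefix))?.config;
}
```

Keeping the providers in a table means a new prefix is a data change rather than another branch in the detection logic.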

2. LAT_LLM_ENDPOINT for any OpenAI-compatible server

Following the design @jeffmm proposed in #36:

LAT_LLM_ENDPOINT=http://localhost:11434/v1  # Ollama, LM Studio, vLLM, etc.
LAT_LLM_MODEL=nomic-embed-text              # optional, defaults to text-embedding-3-small
LAT_LLM_KEY=...                             # still required for auth

When LAT_LLM_ENDPOINT is set, it takes priority over key-prefix detection. This complements #36's local provider by adding a "custom API" tier to the priority chain:

  1. LAT_LLM_ENDPOINT → custom OpenAI-compatible server
  2. LAT_LLM_KEY prefix detection → OpenAI, Vercel, Gemini
  3. Local HuggingFace inference as fallback (PR #36, "feat: local embedding provider for search")
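The priority chain above could be sketched roughly as follows. The `resolveProvider` name and the `ResolvedProvider` type are hypothetical. Appending `/embeddings` to the configured base URL is an assumption about how the endpoint would be used, and the local fallback from #36 is not modeled here, so an unrecognized key simply throws.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical result type for illustration only.
type ResolvedProvider =
  | { kind: "custom"; endpoint: string; model: string; key: string }
  | { kind: "prefix"; provider: "openai" | "vercel" | "gemini"; key: string };

function resolveProvider(env: Env): ResolvedProvider {
  const key = env.LAT_LLM_KEY;
  if (!key) {
    // The proposal states the key is still required, even for custom servers.
    throw new Error("LAT_LLM_KEY is required");
  }

  // Tier 1: an explicit endpoint wins over key-prefix detection.
  const base = env.LAT_LLM_ENDPOINT;
  if (base) {
    return {
      kind: "custom",
      // Assumption: the variable holds a base URL such as http://localhost:11434/v1.
      endpoint: base.replace(/\/+$/, "") + "/embeddings",
      model: env.LAT_LLM_MODEL ?? "text-embedding-3-small",
      key,
    };
  }

  // Tier 2: detect the hosted provider from the key prefix.
  if (key.startsWith("sk-")) return { kind: "prefix", provider: "openai", key };
  if (key.startsWith("vck_")) return { kind: "prefix", provider: "vercel", key };
  if (key.startsWith("AIza")) return { kind: "prefix", provider: "gemini", key };

  // Tier 3 (local inference from #36) would slot in here.
  throw new Error("Unrecognized LAT_LLM_KEY prefix and no LAT_LLM_ENDPOINT set");
}
```

Checking the endpoint first means a Google or OpenAI key can still be pointed at a proxy or local server without the prefix rule overriding it.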

3. Pass dimensions in embedding request

Small fix: the embed() function now includes dimensions in the request body. Gemini needs this (it defaults to 3072 dimensions and must be truncated to 1536), and it is also correct for OpenAI, which supports the same parameter.
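As an illustration of the request and response shapes described in this issue, here is a sketch of building the body with `dimensions` and reading the vectors back. `buildEmbeddingRequest`, `extractEmbeddings`, and `DB_DIMENSIONS` are hypothetical names, not the actual contents of embed().

```typescript
// Shapes taken from the issue text: {model, input, dimensions} in,
// {data: [{embedding, index}]} out.
interface EmbeddingRequestBody {
  model: string;
  input: string[];
  dimensions: number;
}

interface EmbeddingResponseBody {
  data: Array<{ embedding: number[]; index: number }>;
}

// Vector width expected by the existing DB schema.
const DB_DIMENSIONS = 1536;

// Builds the options for a POST to an OpenAI-compatible embeddings endpoint.
function buildEmbeddingRequest(key: string, model: string, input: string[]) {
  const body: EmbeddingRequestBody = { model, input, dimensions: DB_DIMENSIONS };
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${key}`,
    },
    body: JSON.stringify(body),
  };
}

// Returns embeddings in input order and rejects vectors of the wrong width.
function extractEmbeddings(
  response: EmbeddingResponseBody,
  expectedDimensions: number = DB_DIMENSIONS,
): number[][] {
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => {
      if (item.embedding.length !== expectedDimensions) {
        throw new Error(
          `Expected ${expectedDimensions} dimensions, got ${item.embedding.length}`,
        );
      }
      return item.embedding;
    });
}
```

The object returned by `buildEmbeddingRequest` can be passed as the second argument to `fetch`. The width check in `extractEmbeddings` is an extra safeguard not mentioned in the issue: it would surface a server that ignores `dimensions` before mismatched vectors reach the database.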

PR

#41
