Embedding Operations

This page explains the commands used to build, refresh, swap, and clean up semantic search embeddings in Memory Layer, and how to configure multiple embedding backends in parallel.

When You Need This
How Embeddings Work
Configuring Multiple Backends
Commands
Typical Workflows
Troubleshooting

When You Need This

Use these commands when:

you enabled embeddings for the first time
you want vector search for existing memories
you changed the embedding model, or want to keep two models populated at once
you want to switch which backend search uses without recomputing
you want to clean up old embedding spaces after a model switch

How Embeddings Work

Memory Layer stores chunk embeddings in PostgreSQL with pgvector.

Every embedding row is keyed by a space key of the form provider|base_url|model, so vectors from different providers and models coexist without collision. A single chunk can have vectors in several spaces at once.
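As a rough illustration of why keys of this form keep spaces apart, the sketch below joins the three fields with `|`. The helper name `space_key` is hypothetical, and it is an assumption that an empty base_url is stored verbatim; the real key may hold the resolved endpoint instead.

```shell
# Hypothetical helper: build a space key of the form provider|base_url|model.
# Assumption: the three fields are joined verbatim, with no normalization.
space_key() {
  printf '%s|%s|%s\n' "$1" "$2" "$3"
}

openai_key=$(space_key openai "" text-embedding-3-small)
voyage_key=$(space_key voyage "" voyage-code-3)

# A different provider or model yields a different key, so rows cannot collide.
[ "$openai_key" != "$voyage_key" ] && echo "distinct spaces"
```

Because the model is part of the key, changing only the model on a backend would, under this reading, start a new and initially empty space.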

At most one configured backend is active at a time: that's the one memory query uses for semantic retrieval. Setting enabled = false turns semantic retrieval off while leaving configured backends and stored vectors in place. Swapping activation is a constant-time metadata flip with no recomputation, as long as the target space is already populated.

When a backend has create_enabled = true, every new memory is embedded into that backend automatically. The curate step that runs after memory remember (and after the watcher's idle captures) writes chunks into every enabled space declared under [[embeddings.backends]], not just the active one. So configuring two enabled backends from the start means you never have to run reembed later just to switch.

Set create_enabled = false on a backend to stop automatic provider embedding calls for that backend after curation or bundle import. Explicit reembed and reindex commands still work so you can backfill manually when you are ready.

Heads up on cost: with two backends configured, each new memory hits both providers' embedding APIs. That's usually negligible for text-embedding-3-small and voyage-code-3 (both in the low cents per thousand writes), but worth keeping in mind if you add a premium model.

Configuring Multiple Backends

Declare every backend you want available under [[embeddings.backends]] and pick one with [embeddings].active:

[embeddings]
active = "voyage-code"

[[embeddings.backends]]
name = "openai-3-small"
provider = "openai"
base_url = ""
api_key_env = "OPENAI_API_KEY"
model = "text-embedding-3-small"
batch_size = 16
# Optional for OpenAI text-embedding-3 models:
# dimensions = 512

[[embeddings.backends]]
name = "voyage-code"
provider = "voyage"
base_url = ""
api_key_env = "VOYAGE_API_KEY"
model = "voyage-code-3"
batch_size = 16

[[embeddings.backends]]
name = "ollama-nomic"
provider = "ollama"
base_url = "http://127.0.0.1:11434/v1"
api_key_env = ""
model = "nomic-embed-text"
batch_size = 16

The name field is your activation handle and must be unique. If you leave name empty, Memory Layer derives one from {provider}-{model} at load time.
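To make the derived default concrete, here is a minimal sketch of the {provider}-{model} rule. The function name `derive_name` is hypothetical, and it assumes no extra sanitizing is applied to either field.

```shell
# Hypothetical helper mirroring the documented {provider}-{model} default.
# Assumption: both fields are used as-is, without lowercasing or escaping.
derive_name() {
  provider=$1
  model=$2
  printf '%s-%s\n' "$provider" "$model"
}

derive_name voyage voyage-code-3
```

Under this assumption the derived handle for the Voyage backend above would be voyage-voyage-code-3 rather than voyage-code, so setting name explicitly keeps activation handles short and predictable.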

Set enabled = false under [embeddings] to keep backend declarations available while turning semantic retrieval off. The TUI Embeddings tab writes that flag when you press Enter on the currently active backend row, and writes enabled = true again when you activate a backend.

Set create_enabled = false inside a specific [[embeddings.backends]] block to keep semantic search available for existing vectors while preventing automatic creation of new embeddings for that provider. In the TUI Embeddings tab, highlight a backend and press c to toggle this value.

Use provider = "openai" for the official OpenAI embeddings API. Use provider = "openai_compatible" for hosted or proxy APIs that mimic OpenAI's /embeddings endpoint but may not support OpenAI-specific request options. Use provider = "ollama" for local Ollama embeddings with no API key by default. For OpenAI text-embedding-3 models, dimensions = <n> is optional and asks OpenAI to return a smaller vector.

The legacy singleton shape still works:

[embeddings]
provider = "voyage"
model = "voyage-code-3"
api_key_env = "VOYAGE_API_KEY"

Internally this is normalized to a one-element backends list with an auto-derived name, so memory embeddings list will show the same information.

base_url = "" falls back to the provider's well-known endpoint (https://api.openai.com/v1, https://api.voyageai.com, http://127.0.0.1:11434/v1, etc.).
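The fallback can be pictured as a simple lookup. The sketch below is not the service's code: the function name `resolve_base_url` is hypothetical, and pairing each listed endpoint with its provider is an inference from the order given above. Providers without a documented default, such as openai_compatible, are treated as an error here.

```shell
# Hypothetical sketch of the base_url fallback described above.
resolve_base_url() {
  provider=$1
  base_url=$2
  # An explicit base_url always wins.
  if [ -n "$base_url" ]; then
    printf '%s\n' "$base_url"
    return 0
  fi
  # Otherwise fall back to the endpoint listed for the provider.
  case "$provider" in
    openai) echo "https://api.openai.com/v1" ;;
    voyage) echo "https://api.voyageai.com" ;;
    ollama) echo "http://127.0.0.1:11434/v1" ;;
    *)
      echo "no default endpoint known for provider: $provider" >&2
      return 1
      ;;
  esac
}

resolve_base_url ollama ""
```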

Commands

List configured backends and show which is active:

memory embeddings list

Output marks the active backend with * and any backend that didn't resolve at startup (missing API key, empty model) with !.
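If you want to check for unresolved backends from a script, a filter along these lines may be enough. It is only a sketch: the function name `unresolved_backends` is hypothetical, and it assumes that "!" appears in the list output only as the unresolved marker, since the exact column layout is not specified here.

```shell
# Reads `memory embeddings list` output on stdin and prints the lines
# that carry the "!" marker. Exit status is non-zero when none are found.
# Assumption: "!" never appears in backend names or other columns.
unresolved_backends() {
  grep -F '!'
}
```

A possible use is `memory embeddings list | unresolved_backends`, which prints nothing and returns a non-zero status when every backend resolved.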

The same information, plus per-project chunk and memory counts, is available interactively in the Embeddings Tab of the TUI.

Switch which backend search uses:

memory embeddings activate voyage-code

The service rewrites [embeddings].active in the config file and updates its in-memory state without restarting. Existing embeddings for the new space are used immediately; nothing is recomputed.

To turn embeddings off interactively, open memory tui, go to the Embeddings tab, highlight the active row, and press Enter. The service rewrites [embeddings].enabled = false; pressing Enter on any ready backend turns it back on.

To stop only automatic creation of new embeddings for one provider, highlight that backend in the Embeddings tab and press c. Manual reembed and reindex still create embeddings.

Build chunks and embeddings for a project:

memory embeddings reindex --project my-project

This is the heavy rebuild path. It recreates the project's shared chunks and then populates every configured backend so all spaces stay in sync.

--backend is still accepted for compatibility, and it is deliberately non-destructive:

memory embeddings reindex --project my-project --backend voyage-code

With --backend, reindex does not delete or recreate shared chunks. Instead it fills in missing embeddings for that backend's space and preserves embeddings already stored for OpenAI, Voyage, Ollama, or any other configured backend. Omit --backend when you really want to rebuild chunks.

Preview without writing:

memory embeddings reindex --project my-project --dry-run

Refresh only the embeddings of configured backends for a project (does not rebuild chunks):

memory embeddings reembed --project my-project
memory embeddings reembed --project my-project --backend voyage-code
memory embeddings reembed --project my-project --dry-run

Use reembed when:

you added a new backend to config and want to populate its space
an existing backend's space is only partially covered
you prefer not to do the full reindex chunk rebuild
you want to switch quickly between providers without paying to recompute embeddings that are already present for other backends
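For the first case in the list above, reembed takes one project at a time, so populating a new backend across several projects means repeating the command. A minimal sketch, in which the wrapper name `backfill_backend` and the project names are hypothetical:

```shell
# Hypothetical wrapper: fill one backend's space for several projects.
# Usage: backfill_backend BACKEND PROJECT...
backfill_backend() {
  backend=$1
  shift
  for project in "$@"; do
    # Stop at the first failure so a bad API key is noticed early.
    memory embeddings reembed --project "$project" --backend "$backend" || return 1
  done
}
```

For example, `backfill_backend voyage-code project-a project-b` would run reembed once per project against the voyage-code space.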

Delete embedding rows whose space isn't in any configured backend:

memory embeddings prune --project my-project
memory embeddings prune --project my-project --dry-run

prune operates relative to the set of currently configured backends (not just the active one), so removing a backend from config before pruning is the right order when you want to retire a model completely.
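Since prune deletes rows, it can help to separate the preview pass from the destructive pass. The sketch below does that for a list of projects; the function name `prune_projects` and its preview/apply argument are hypothetical conveniences, not part of the CLI.

```shell
# Hypothetical wrapper around the two documented prune invocations.
# Usage: prune_projects preview|apply PROJECT...
prune_projects() {
  mode=$1
  shift
  for project in "$@"; do
    case "$mode" in
      preview) memory embeddings prune --project "$project" --dry-run ;;
      apply)   memory embeddings prune --project "$project" ;;
      *)
        echo "usage: prune_projects preview|apply PROJECT..." >&2
        return 2
        ;;
    esac || return 1
  done
}
```

Run it with preview first, after the backend has been removed from config and the service restarted, then with apply once the reported spaces are the ones you expect to lose.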

Typical Workflows

Your config files live in the locations listed under Getting Started → File Locations — on Debian that's /etc/memory-layer/memory-layer.toml and /etc/memory-layer/memory-layer.env; on Linux user-level installs it's ~/.config/memory-layer/memory-layer.toml and ~/.config/memory-layer/memory-layer.env.

Enable embeddings for the first time (single backend)

Add your API key to memory-layer.env:
```
OPENAI_API_KEY=sk-proj-...
```

Add an [embeddings] block to memory-layer.toml:

[embeddings]
provider = "openai"
base_url = ""
api_key_env = "OPENAI_API_KEY"
model = "text-embedding-3-small"
batch_size = 16

Restart the service. Confirm the setup:

memory doctor
memory embeddings list          # should show one backend, no "!"

Backfill embeddings for existing memories:

memory embeddings reindex --project my-project

Enable two backends from day one

Do this if you know up-front you want the option to switch models freely — it avoids having to reembed the whole corpus later.

Put both API keys in memory-layer.env:

OPENAI_API_KEY=sk-proj-...
VOYAGE_API_KEY=pa-...

Declare both backends in memory-layer.toml and pick one with active:

[embeddings]
active = "voyage-code"

[[embeddings.backends]]
name = "openai-3-small"
provider = "openai"
base_url = ""
api_key_env = "OPENAI_API_KEY"
model = "text-embedding-3-small"
batch_size = 16

[[embeddings.backends]]
name = "voyage-code"
provider = "voyage"
base_url = ""
api_key_env = "VOYAGE_API_KEY"
model = "voyage-code-3"
batch_size = 16

Restart the service. Confirm both resolved:

memory embeddings list          # both listed; active marked with *; neither marked with !

Backfill every existing memory into both spaces (default behavior — no --backend flag):
```
memory embeddings reindex --project my-project
```
From here on, every new memory is embedded into both spaces automatically. Swap which one search uses any time:
```
memory embeddings activate openai-3-small
memory embeddings activate voyage-code
```

Add a second backend to an existing install

Same end state as the two-backend workflow, just applied incrementally:

Add the new [[embeddings.backends]] block and its API key line in memory-layer.env.
Restart the service. memory embeddings list should show both.

Backfill existing memories into the new space:

memory embeddings reembed --project my-project

Switch search to the new backend whenever you're ready:
```
memory embeddings activate <new-backend-name>
```

Retire a backend

Remove its [[embeddings.backends]] block from config. Restart the service.
Run memory embeddings prune --project my-project to drop the orphaned space.

Troubleshooting

If semantic search is not working:

run memory doctor
confirm pgvector is installed and the vector extension exists in the target database
confirm at least one [[embeddings.backends]] entry has a non-empty model
confirm the API key env var referenced by api_key_env is present in memory-layer.env
run memory embeddings list and confirm the active backend is not marked with !

If the active space's vectors are missing for some memories (semantic search returns fewer results than lexical), run:

memory embeddings reembed --project my-project --backend <active-name>

If a newly-added backend is marked ! even after a restart, check that the referenced API key env var is populated in memory-layer.env and that model is non-empty.
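The env-file check can be scripted. This is a sketch under assumptions: the function name `env_key_present` is hypothetical, and it expects plain KEY=value lines, so a quoted empty value such as KEY="" would wrongly pass and lines prefixed with export would be missed.

```shell
# Hypothetical check: succeed only if KEY=<non-empty value> appears
# at the start of a line in the given env file.
# Usage: env_key_present FILE KEY
env_key_present() {
  env_file=$1
  key=$2
  grep -Eq "^${key}=.+" "$env_file"
}
```

For a user-level install this might be called as `env_key_present ~/.config/memory-layer/memory-layer.env VOYAGE_API_KEY`, with the key name taken from the backend's api_key_env setting.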
