@r3v5 r3v5 commented Jan 28, 2026

What does this PR do?

This PR adds support for the IVFFlat index in PGVector by extending PGVectorVectorIOConfig. Users can now choose between HNSW and IVFFlat indexes for ANN search in pgvector and configure each index's specific parameters.

Example of vector_io configuration in YAML:

vector_io:
  - provider_id: ${env.PGVECTOR_DB:+pgvector}
    provider_type: remote::pgvector
    config:
      host: ${env.PGVECTOR_HOST:=localhost}
      port: ${env.PGVECTOR_PORT:=5432}
      db: ${env.PGVECTOR_DB:=testvectordb}
      user: ${env.PGVECTOR_USER:=user}
      password: ${env.PGVECTOR_PASSWORD:=password}
      distance_metric: COSINE
      vector_index: 
        type: IVFFlat
        lists: 100
        probes: 10
      persistence:
        namespace: vector_io::pgvector
        backend: kv_default
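
For readers who want to see how a `vector_index` block like the one above could be turned into typed settings, here is a minimal sketch. `PGVectorIndexType` appears in the logs further down; the two config classes, `parse_vector_index`, the default values, and the fallback to HNSW when `type` is missing are assumptions made for illustration and may differ from the code in this PR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class PGVectorIndexType(str, Enum):
    HNSW = "HNSW"
    IVFFlat = "IVFFlat"


@dataclass(frozen=True)
class HNSWIndexConfig:  # hypothetical class name
    m: int = 16
    ef_construction: int = 64
    type: PGVectorIndexType = PGVectorIndexType.HNSW


@dataclass(frozen=True)
class IVFFlatIndexConfig:  # hypothetical class name
    lists: int = 100
    probes: int = 10
    type: PGVectorIndexType = PGVectorIndexType.IVFFlat


def parse_vector_index(raw: dict) -> Union[HNSWIndexConfig, IVFFlatIndexConfig]:
    """Select the config class from `type`; remaining keys are index parameters."""
    params = {key: value for key, value in raw.items() if key != "type"}
    # Assumption: HNSW is used when no type is given.
    index_type = PGVectorIndexType(raw.get("type", "HNSW"))
    if any(value < 1 for value in params.values()):
        raise ValueError("index parameters must be positive integers")
    if index_type is PGVectorIndexType.IVFFlat:
        return IVFFlatIndexConfig(**params)
    return HNSWIndexConfig(**params)
```

An unknown `type` raises `ValueError` from the enum lookup, and a parameter that does not belong to the chosen index (for example `m` with IVFFlat) raises `TypeError` from the dataclass constructor.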

Closes #4745

Test Plan

Llama Stack logs:

vector_io:
           - config:
               db: testvectordb
               distance_metric: COSINE
               host: localhost
               password: '********'
               persistence:
                 backend: kv_default
                 namespace: vector_io::pgvector
               port: 5432
               user: user
               vector_index:
                 lists: 100
                 probes: 10
                 type: IVFFlat
             provider_id: pgvector
             provider_type: remote::pgvector

INFO     2026-01-28 17:42:02,542 llama_stack.providers.remote.vector_io.pgvector.pgvector:514 vector_io::pgvector:
         Initializing PGVector memory adapter with config: {'host': 'localhost', 'port': 5432, 'db': 'testvectordb',
         'user': 'user', 'distance_metric': 'COSINE', 'vector_index': {'type': <PGVectorIndexType.IVFFlat: 'IVFFlat'>,
         'lists': 100, 'probes': 10}, 'persistence': {'namespace': 'vector_io::pgvector', 'backend': 'kv_default'},
         'password': '******'}

Creation of index:

INFO     2026-01-28 17:47:22,756 llama_stack.providers.remote.vector_io.pgvector.pgvector:477 vector_io::pgvector:
         Checking vector_store: vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5 for conflicting vector index in PGVector...
INFO     2026-01-28 17:47:22,759 llama_stack.providers.remote.vector_io.pgvector.pgvector:490 vector_io::pgvector:
         vector_store: vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5 currently doesn't have conflicting vector index
INFO     2026-01-28 17:47:22,759 llama_stack.providers.remote.vector_io.pgvector.pgvector:491 vector_io::pgvector:
         Proceeding with creation of vector index for vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5
INFO     2026-01-28 17:47:22,770 llama_stack.providers.remote.vector_io.pgvector.pgvector:461 vector_io::pgvector:
         IVFFlat vector index was created with parameter lists = 100 for vector_store:
         vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5.

testvectordb=# SELECT indexname, indexdef
FROM pg_indexes
WHERE indexdef LIKE '%ivfflat%';
                       indexname                        |                                                                                        indexdef
--------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx | CREATE INDEX vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx ON public.vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5 USING ivfflat (embedding vector_cosine_ops) WITH (lists='100')
(1 row)

testvectordb=#
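
The `indexdef` above suggests the shape of the statement that creates the index. The following is a rough sketch of how such a statement could be assembled, not the PR's actual code. The `<table>_<method>_idx` naming and `vector_cosine_ops` come from the output above; `build_create_index_sql`, the `L2` and `INNER_PRODUCT` metric names, and the use of `IF NOT EXISTS` are assumptions.

```python
import re

# pgvector operator classes; only the COSINE entry is confirmed by the output above.
OPERATOR_CLASSES = {
    "COSINE": "vector_cosine_ops",
    "L2": "vector_l2_ops",
    "INNER_PRODUCT": "vector_ip_ops",
}

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_create_index_sql(table: str, index_type: str, metric: str, **params: int) -> str:
    """Return a CREATE INDEX statement for an HNSW or IVFFlat index on `embedding`."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    method = index_type.lower()
    if method not in ("hnsw", "ivfflat"):
        raise ValueError(f"unsupported index type: {index_type!r}")
    opclass = OPERATOR_CLASSES[metric]
    # Build-time options, e.g. lists for IVFFlat or m / ef_construction for HNSW.
    options = ", ".join(f"{name} = {int(value)}" for name, value in params.items())
    with_clause = f" WITH ({options})" if options else ""
    return (
        f"CREATE INDEX IF NOT EXISTS {table}_{method}_idx "
        f"ON {table} USING {method} (embedding {opclass}){with_clause}"
    )
```

Calling it with `lists=100` for the table above yields a statement ending in `USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)`.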

Vector search via IVFFlat index:

Query Plan:

Limit  (cost=1.26..7.53 rows=1 width=40) (actual time=0.567..0.712 rows=6.00 loops=1)
  Buffers: shared hit=123
  InitPlan 1
    ->  Limit  (cost=0.00..1.01 rows=1 width=32) (actual time=0.017..0.019 rows=1.00 loops=1)
          Buffers: shared hit=1
          ->  Seq Scan on vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5 vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_1  (cost=0.00..1.01 rows=1 width=32) (actual time=0.017..0.017 rows=1.00 loops=1)
                Disabled: true
                Buffers: shared hit=1
  ->  Index Scan using vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx on vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5  (cost=0.25..6.52 rows=1 width=40) (actual time=0.565..0.705 rows=6.00 loops=1)
        Order By: (embedding <=> (InitPlan 1).col1)
        Index Searches: 0
        Buffers: shared hit=123
Planning:
  Buffers: shared hit=93
Planning Time: 1.649 ms
Execution Time: 0.801 ms
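
The plan above orders by `embedding <=> ...`, which is pgvector's cosine distance operator. As a sketch of how the `probes` setting could be applied at query time, the helper below returns the statements to run inside one transaction. `ivfflat.probes` and `hnsw.ef_search` are pgvector settings; `build_search_statements`, the non-COSINE metric names, and the assumption that the provider applies `probes` per query with `SET LOCAL` are illustrative only.

```python
import re
from typing import Tuple

# pgvector distance operators; only "<=>" is visible in the plan above.
DISTANCE_OPERATORS = {"COSINE": "<=>", "L2": "<->", "INNER_PRODUCT": "<#>"}

# Query-time tuning setting for each index method.
SEARCH_SETTINGS = {"ivfflat": "ivfflat.probes", "hnsw": "hnsw.ef_search"}

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_search_statements(
    table: str, metric: str, method: str, tuning_value: int, limit: int
) -> Tuple[str, str]:
    """Return (tuning statement, k-NN query) to execute in the same transaction."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unsafe table name: {table!r}")
    setting = SEARCH_SETTINGS[method.lower()]
    operator = DISTANCE_OPERATORS[metric]
    # SET LOCAL lasts only until the end of the current transaction.
    tune = f"SET LOCAL {setting} = {int(tuning_value)}"
    # %s is a driver placeholder for the query embedding, e.g. '[0.1, 0.2, ...]'.
    query = (
        f"SELECT * FROM {table} "
        f"ORDER BY embedding {operator} %s::vector LIMIT {int(limit)}"
    )
    return tune, query
```

The `ORDER BY ... LIMIT` form with the distance operator is what lets the planner choose the index scan shown in the plan.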

Flexibility: a different index type can be configured at Llama Stack startup without breaking existing vector stores:

~ podman exec -it pgvector psql -U user -d testvectordb
psql (18.1 (Debian 18.1-1.pgdg13+2))
Type "help" for help.

testvectordb=# \z
                                               Access privileges
 Schema |                    Name                    | Type  | Access privileges | Column privileges | Policies
--------+--------------------------------------------+-------+-------------------+-------------------+----------
 public | metadata_store                             | table |                   |                   |
 public | vs_vs_c3750d30_555c_4579_a0f7_8604eb218b08 | table |                   |                   |
 public | vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5 | table |                   |                   |
(3 rows)

testvectordb=# SELECT indexname, indexdef
FROM pg_indexes
WHERE indexdef LIKE '%ivfflat%';
                       indexname                        |                                                                                        indexdef
--------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx | CREATE INDEX vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx ON public.vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5 USING ivfflat (embedding vector_cosine_ops) WITH (lists='100')
(1 row)

testvectordb=# SELECT indexname
FROM pg_indexes
WHERE indexname LIKE '%hnsw%';
                      indexname
-----------------------------------------------------
 vs_vs_c3750d30_555c_4579_a0f7_8604eb218b08_hnsw_idx
(1 row)

testvectordb=#

INFO     2026-01-29 18:00:56,903 llama_stack.providers.remote.vector_io.pgvector.pgvector:579 vector_io::pgvector:
         Initializing PGVector memory adapter with config: {'host': 'localhost', 'port': 5432, 'db': 'testvectordb',
         'user': 'user', 'distance_metric': 'COSINE', 'vector_index': {'type': <PGVectorIndexType.HNSW: 'HNSW'>, 'm':
         16, 'ef_construction': 64}, 'persistence': {'namespace': 'vector_io::pgvector', 'backend': 'kv_default'},
         'password': '******'}
INFO     2026-01-29 18:00:56,921 llama_stack.providers.remote.vector_io.pgvector.pgvector:595 vector_io::pgvector:
         Vector extension version: 0.8.1
INFO     2026-01-29 18:00:56,923 llama_stack.providers.remote.vector_io.pgvector.pgvector:496 vector_io::pgvector:
         Checking vector_store: vs_5b93e416-01a1-4b11-8316-038f7e66abae for conflicting vector index in PGVector...
WARNING  2026-01-29 18:00:56,925 llama_stack.providers.remote.vector_io.pgvector.pgvector:513 vector_io::pgvector:
         Conflicting vector index vs_vs_5b93e416_01a1_4b11_8316_038f7e66abae_hnsw_idx already exists in vector_store:
         vs_5b93e416-01a1-4b11-8316-038f7e66abae
WARNING  2026-01-29 18:00:56,925 llama_stack.providers.remote.vector_io.pgvector.pgvector:516 vector_io::pgvector:
         vector_store: vs_5b93e416-01a1-4b11-8316-038f7e66abae will continue to use vector index
         vs_vs_5b93e416_01a1_4b11_8316_038f7e66abae_hnsw_idx to preserve performance.
INFO     2026-01-29 18:00:56,926 llama_stack.providers.remote.vector_io.pgvector.pgvector:496 vector_io::pgvector:
         Checking vector_store: vs_c3750d30-555c-4579-a0f7-8604eb218b08 for conflicting vector index in PGVector...
WARNING  2026-01-29 18:00:56,927 llama_stack.providers.remote.vector_io.pgvector.pgvector:513 vector_io::pgvector:
         Conflicting vector index vs_vs_c3750d30_555c_4579_a0f7_8604eb218b08_hnsw_idx already exists in vector_store:
         vs_c3750d30-555c-4579-a0f7-8604eb218b08
WARNING  2026-01-29 18:00:56,927 llama_stack.providers.remote.vector_io.pgvector.pgvector:516 vector_io::pgvector:
         vector_store: vs_c3750d30-555c-4579-a0f7-8604eb218b08 will continue to use vector index
         vs_vs_c3750d30_555c_4579_a0f7_8604eb218b08_hnsw_idx to preserve performance.
INFO     2026-01-29 18:00:56,928 llama_stack.providers.remote.vector_io.pgvector.pgvector:496 vector_io::pgvector:
         Checking vector_store: vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5 for conflicting vector index in PGVector...
WARNING  2026-01-29 18:00:56,929 llama_stack.providers.remote.vector_io.pgvector.pgvector:513 vector_io::pgvector:
         Conflicting vector index vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx already exists in vector_store:
         vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5
WARNING  2026-01-29 18:00:56,929 llama_stack.providers.remote.vector_io.pgvector.pgvector:516 vector_io::pgvector:
         vector_store: vs_e545f930-8da0-487d-a9a6-03b3fe26c8b5 will continue to use vector index
         vs_vs_e545f930_8da0_487d_a9a6_03b3fe26c8b5_ivfflat_idx to preserve performance.
INFO     2026-01-29 18:00:57,688 llama_stack.providers.utils.inference.openai_mixin:484 providers::utils:
         OpenAIInferenceAdapter.list_provider_model_ids() returned 121 models
INFO     2026-01-29 18:00:57,878 uvicorn.error:84 uncategorized: Started server process [17376]
INFO     2026-01-29 18:00:57,879 uvicorn.error:48 uncategorized: Waiting for application startup.
INFO     2026-01-29 18:00:57,880 llama_stack.core.server.server:175 core::server: Starting up Llama Stack server
         (version: 0.4.0.dev0)
INFO     2026-01-29 18:00:57,881 llama_stack.core.stack:702 core: starting registry refresh task
INFO     2026-01-29 18:00:57,881 uvicorn.error:62 uncategorized: Application startup complete.
INFO     2026-01-29 18:00:57,881 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321
         (Press CTRL+C to quit)
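
The startup logs above show each vector store being checked for an existing vector index before a new one is created. Note that an existing HNSW index is reported as conflicting even though HNSW is the configured type, so the sketch below treats any existing vector index as a reason to skip creation. The function names and the suffix-based detection are assumptions; the PR may detect indexes differently, for example by inspecting `indexdef`.

```python
from typing import Iterable, Optional

# Index name suffixes seen in the psql output above.
VECTOR_INDEX_SUFFIXES = ("_hnsw_idx", "_ivfflat_idx")

# Query against the standard pg_indexes view; %s is the table name.
EXISTING_INDEXES_SQL = "SELECT indexname FROM pg_indexes WHERE tablename = %s"


def find_existing_vector_index(index_names: Iterable[str]) -> Optional[str]:
    """Return the first HNSW or IVFFlat index name, or None if there is none."""
    for name in index_names:
        if name.endswith(VECTOR_INDEX_SUFFIXES):
            return name
    return None


def should_create_index(index_names: Iterable[str]) -> bool:
    """Create a new vector index only when the table has none yet."""
    return find_existing_vector_index(index_names) is None
```

Feeding it the `indexname` values returned by `EXISTING_INDEXES_SQL` for a table gives the keep-or-create decision; ordinary indexes such as a primary key are ignored.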

The IVFFlat index is created only if a vector store has enough data:

INFO     2026-01-29 18:12:56,344 uvicorn.access:476 uncategorized: ::1:62041 - "POST /v1/vector_stores HTTP/1.1" 200
INFO     2026-01-29 18:13:31,071 llama_stack.providers.utils.inference.embedding_mixin:93 providers::utils: Loading
         sentence transformer for nomic-ai/nomic-embed-text-v1.5...
WARNING  2026-01-29 18:13:36,188
         transformers_modules.nomic_hyphen_ai.nomic_hyphen_bert_hyphen_2048.7710840340a098cfb869c4f65e87cf2b1b70caca.modeling_hf_nomic_bert:466
         uncategorized: <All keys matched successfully>
INFO     2026-01-29 18:13:37,178 llama_stack.providers.remote.vector_io.pgvector.pgvector:496 vector_io::pgvector:
         Checking vector_store: vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9 for conflicting vector index in PGVector...
INFO     2026-01-29 18:13:37,180 llama_stack.providers.remote.vector_io.pgvector.pgvector:521 vector_io::pgvector:
         vector_store: vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9 currently doesn't have conflicting vector index
INFO     2026-01-29 18:13:37,180 llama_stack.providers.remote.vector_io.pgvector.pgvector:522 vector_io::pgvector:
         Proceeding with creation of vector index for vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9
INFO     2026-01-29 18:13:37,180 llama_stack.providers.remote.vector_io.pgvector.pgvector:541 vector_io::pgvector:
         Fetching number of records in vector_store: vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9...
INFO     2026-01-29 18:13:37,181 llama_stack.providers.remote.vector_io.pgvector.pgvector:553 vector_io::pgvector:
         vector_store: vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9 has 4 records.
INFO     2026-01-29 18:13:37,181 llama_stack.providers.remote.vector_io.pgvector.pgvector:459 vector_io::pgvector:
         IVFFlat index wasn't created for vector_store vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9 because table doesn't
         have enough records.
INFO     2026-01-29 18:13:37,188 uvicorn.access:476 uncategorized: ::1:62055 - "POST
         /v1/vector_stores/vs_6a1fd240-f6b6-4c0a-b6be-5c8a68b799b9/files HTTP/1.1" 200
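
The logs above show index creation being skipped for a store with 4 records and `lists = 100`, but they do not state the threshold. A plausible gate, sketched here under the assumption that a table needs at least as many rows as `lists`, could look like the following. Both function names are hypothetical. The `suggested_lists` helper follows the starting point recommended in the pgvector README (rows / 1000 up to 1M rows, sqrt(rows) beyond that).

```python
import math


def has_enough_rows_for_ivfflat(row_count: int, lists: int) -> bool:
    """Assumed rule: IVFFlat needs at least one row per list to cluster usefully."""
    return row_count >= lists


def suggested_lists(row_count: int) -> int:
    """Starting value for `lists` based on table size, never below 1."""
    if row_count <= 1_000_000:
        return max(1, row_count // 1000)
    return math.isqrt(row_count)
```

With this rule, the store in the logs (4 rows, 100 lists) is skipped, and the index would be created once the table holds at least 100 rows.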

@meta-cla meta-cla bot added the CLA Signed label Jan 28, 2026
@r3v5 r3v5 force-pushed the add-support-for-IVFFlat-vector-index-in-pgvector branch 4 times, most recently from d836864 to 0d8c417 on January 29, 2026 15:23
…n PGVector

Signed-off-by: Ian Miller <milleryan2003@gmail.com>
@r3v5 r3v5 force-pushed the add-support-for-IVFFlat-vector-index-in-pgvector branch from 0d8c417 to 0e2e224 on January 29, 2026 18:39

@nathan-weinberg nathan-weinberg left a comment

Generally LGTM - one question


Via Docker:
```bash
docker pull pgvector/pgvector:0.8.1-pg18-trixie
```

Contributor

Any reason we are pulling this particular tag for the pgvector container?

Contributor Author

Hi @nathan-weinberg! No, there is no particular reason. As far as I understand, it's simply the latest pgvector image available, which is why I chose it.

Contributor

Hmmm yeah, weird that they don't have a latest tag: https://hub.docker.com/r/pgvector/pgvector/tags

Fine from my side in this case then - thanks!

@franciscojavierarceo franciscojavierarceo merged commit bdeff00 into llamastack:main Jan 29, 2026
73 checks passed

Development

Successfully merging this pull request may close these issues.

Add support for IVFFlat (Inverted File with Flat Compression) vector index for ANN search in PGVector
