benchmarks: add retrieval harness and docs #54

Open
shadow6427 wants to merge 1 commit into Dipraise1:main from shadow6427:bench-suite

Conversation

@shadow6427

Closes #24

This PR implements the retrieval benchmark suite described in Phase 3 of the roadmap.

What's included:

  • scripts/bench/ Harness: An automated Python benchmarking suite (main.py) that measures Recall@K for K = 1, 5, and 10, along with p50 and p95 query latency.
  • Dataset Generation: Uses the HuggingFace datasets library to pull a subset of BEIR (scifact by default) and compute embeddings using sentence-transformers.
  • Database Clients: Pluggable adapters (clients.py) for:
    • Engram (EngramBenchClient)
    • Pinecone (PineconeBenchClient)
    • Weaviate (WeaviateBenchClient)
    • pgvector (PgVectorBenchClient)
  • Docker Compose: Includes scripts/bench/docker-compose.bench.yml to start local instances of Weaviate and PostgreSQL (pgvector).
  • Documentation: Running the harness automatically generates a Markdown report at docs/benchmarks.md.
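
For reviewers who want a concrete picture of the metrics and the adapter shape listed above, here is a minimal sketch. It is not the code in main.py or clients.py: `BenchClient`, `InMemoryClient`, `recall_at_k`, `percentile`, and `run_benchmark` are hypothetical names, the brute-force cosine client is only a stand-in for a real backend, and the nearest-rank percentile is one of several reasonable definitions.

```python
import math
import time
from typing import Protocol, Sequence


class BenchClient(Protocol):
    """Hypothetical adapter interface; the PR's clients.py may differ."""

    def upsert(self, ids: Sequence[str], vectors: Sequence[Sequence[float]]) -> None: ...
    def query(self, vector: Sequence[float], top_k: int) -> list[str]: ...


class InMemoryClient:
    """Brute-force cosine-similarity stand-in for a real vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, list[float]]] = []

    def upsert(self, ids, vectors):
        self._rows.extend((i, list(v)) for i, v in zip(ids, vectors))

    def query(self, vector, top_k):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self._rows, key=lambda row: cosine(vector, row[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]


def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def run_benchmark(client, queries, qrels, ks=(1, 5, 10)):
    """queries: {query_id: vector}; qrels: {query_id: set of relevant doc ids}."""
    latencies_ms = []
    recall_sums = {k: 0.0 for k in ks}
    for qid, vector in queries.items():
        start = time.perf_counter()
        retrieved = client.query(vector, top_k=max(ks))
        latencies_ms.append((time.perf_counter() - start) * 1000)
        for k in ks:
            recall_sums[k] += recall_at_k(retrieved, qrels[qid], k)
    report = {f"recall@{k}": recall_sums[k] / len(queries) for k in ks}
    report["p50_ms"] = percentile(latencies_ms, 50)
    report["p95_ms"] = percentile(latencies_ms, 95)
    return report


# Toy usage with made-up 2-D vectors (not benchmark data).
client = InMemoryClient()
client.upsert(["d1", "d2", "d3"], [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
queries = {"q1": [1.0, 0.1], "q2": [0.1, 1.0]}
qrels = {"q1": {"d1"}, "q2": {"d2", "d3"}}
report = run_benchmark(client, queries, qrels, ks=(1, 2))
```

Keeping the metric functions independent of any one backend means each adapter only has to return ranked ids, so the same recall and latency code can be applied to every client.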

Running the harness:

cd scripts/bench
docker compose -f docker-compose.bench.yml up -d
python main.py

(To generate a mocked version of the docs, run python main.py --mock)
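
As a rough illustration of the report step, the sketch below renders per-backend metrics as a Markdown table. The function name `render_report`, the dictionary layout, and the report title are assumptions rather than what main.py does, and the values in the usage are placeholders, not measurements.

```python
def render_report(results: dict[str, dict[str, float]]) -> str:
    """Render {backend: {metric: value}} as a Markdown table (hypothetical layout)."""
    # Collect every metric name seen across backends so columns line up.
    metrics = sorted({m for row in results.values() for m in row})
    header = "| Backend | " + " | ".join(metrics) + " |"
    divider = "|---" * (len(metrics) + 1) + "|"
    lines = ["# Retrieval benchmarks", "", header, divider]
    for backend, row in results.items():
        cells = [f"{row[m]:.3f}" if m in row else "n/a" for m in metrics]
        lines.append(f"| {backend} | " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


# Placeholder values only; these are not benchmark results.
placeholder = {"backend-a": {"recall@1": 0.5, "p50_ms": 1.25}}
markdown = render_report(placeholder)
```

Writing the returned string to a file is left to the caller, which keeps the rendering logic easy to check without touching the docs directory.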

@vercel

vercel Bot commented Jun 21, 2026

@shadow6427 is attempting to deploy a commit to the praise's projects Team on Vercel.

A member of the Team first needs to authorize it.

Development

Successfully merging this pull request may close these issues.

[benchmarks] Retrieval benchmark suite — recall@K vs Pinecone, Weaviate, pgvector

1 participant