Vector Database REST API

A high-performance vector database REST API built with FastAPI

Quick Setup

Prerequisites

Docker Engine 20.10+ and Docker Compose 2.0+
4GB+ RAM recommended
2GB+ available disk space

1. Build the Docker Image

make build
# Or: docker build -t vectordb-api:latest .

2. Start the Vector Database

make run
# Or: docker run -d --name vectordb -p 8000:8000 -v vectordb_data:/app/data vectordb-api:latest

3. Verify it's Running

# Check container status
docker ps

# Test health endpoint
curl http://localhost:8000/health

4. Access the API

API Base: http://localhost:8000
Interactive Documentation: http://localhost:8000/docs
Health Check: http://localhost:8000/health

5. Quick API Test

# Create a library
curl -X POST "http://localhost:8000/libraries/" \
  -H "Content-Type: application/json" \
  -d '{"name": "Test Library", "metadata": {"description": "Demo library"}}'

# Note the library ID from response, then add a document
export LIBRARY_ID="your-library-id-here"

curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/documents/" \
  -H "Content-Type: application/json" \
  -d '{
    "document_data": {"metadata": {"title": "AI Document"}},
    "chunk_texts": ["Artificial intelligence is transforming technology."]
  }'

# Build index
curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/index" \
  -H "Content-Type: application/json" \
  -d '{"index_type": "optimized_linear", "force_rebuild": true}'

# Search
curl -X POST "http://localhost:8000/libraries/$LIBRARY_ID/search" \
  -H "Content-Type: application/json" \
  -d '{"query_text": "machine learning", "top_k": 3}'

6. Access Swagger UI

Open your browser
Navigate to http://localhost:8000/docs
You should see the interactive API documentation

7. Follow the Manual Testing Guide

Step 7.1: Create a Library

Click on POST /libraries
Use this JSON:

{
  "name": "AI Research Library",
  "metadata": {
    "description": "Collection of AI and ML documents",
    "category": "research"
  }
}

Copy the returned library_id

Step 7.2: Create a Document & add Chunks

Click on POST /libraries/{library_id}/documents
Replace {library_id} with your copied ID
Use this JSON:

{
  "document_data": {
    "metadata": {
      "title": "string",
      "author": "string",
      "created_at": "2025-09-30T06:24:03.501Z",
      "updated_at": "2025-09-30T06:24:03.501Z",
      "source": "string",
      "file_type": "string",
      "file_size": 0,
      "tags": [
        "string"
      ],
      "category": "string",
      "language": "string"
    }
  },
  "chunk_texts": [
    "Artificial intelligence (AI) is the simulation of human intelligence in machines to perform tasks and learn from experience, often involving complex reasoning, planning, and decision-making"
  ]
}
- Copy the returned `document_id`

**Step 7.3: Build Index**
- Click on **POST /libraries/{library_id}/index**
- Try different index types:
```json
{
  "index_type": "linear"
}

Step 7.4: Perform Search

Click on POST /libraries/{library_id}/search
Test semantic search:

{
  "query_text": "What is artificial intelligence?",
  "top_k": 5,
  "include_text": true,
  "include_metadata": true
}

Architecture Diagram

Complete System Architecture (Domain-Driven Design)

┌─────────────────────────────────────────────────────────────────────────────────┐
│                              CLIENT APPLICATIONS                                 │
│                                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │  Web Browser │  │   Python SDK │  │     cURL     │  │  Mobile App  │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         │                  │                  │                  │               │
│         └──────────────────┴──────────────────┴──────────────────┘               │
│                                    │                                             │
│                              HTTP/REST API                                       │
└────────────────────────────────────┼─────────────────────────────────────────────┘
                                     │
                                     ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                          LAYER 1: API LAYER (FastAPI)                           │
│                        Handles HTTP, Validation, Routing                        │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌────────────────────┐  ┌────────────────────┐  ┌─────────────────────┐      │
│  │  library_router.py │  │ document_router.py │  │  chunk_router.py    │      │
│  ├────────────────────┤  ├────────────────────┤  ├─────────────────────┤      │
│  │ POST   /libraries  │  │ POST /libraries/   │  │ POST /libraries/    │      │
│  │ GET    /libraries  │  │      {id}/docs     │  │      {id}/chunks    │      │
│  │ GET    /libraries/ │  │ GET  /libraries/   │  │ GET  /libraries/    │      │
│  │        {id}        │  │      {id}/docs/    │  │      {id}/chunks/   │      │
│  │ PUT    /libraries/ │  │      {doc_id}      │  │      {chunk_id}     │      │
│  │        {id}        │  │ PUT  /libraries/   │  │ PUT  /libraries/    │      │
│  │ DELETE /libraries/ │  │      {id}/docs/    │  │      {id}/chunks/   │      │
│  │        {id}        │  │      {doc_id}      │  │      {chunk_id}     │      │
│  └────────────────────┘  │ DELETE /libraries/ │  │ DELETE /libraries/  │      │
│                          │      {id}/docs/    │  │      {id}/chunks/   │      │
│  ┌────────────────────┐  │      {doc_id}      │  │      {chunk_id}     │      │
│  │  search_router.py  │  └────────────────────┘  └─────────────────────┘      │
│  ├────────────────────┤                                                         │
│  │ POST /libraries/   │  ┌────────────────────┐                                │
│  │      {id}/search   │  │  index_router.py   │                                │
│  │                    │  ├────────────────────┤                                │
│  │ - Query by text    │  │ POST /libraries/   │                                │
│  │ - Query by vector  │  │      {id}/index    │                                │
│  │ - Metadata filters │  │ GET  /libraries/   │                                │
│  │ - Top K results    │  │      {id}/index/   │                                │
│  └────────────────────┘  │      stats         │                                │
│                          └────────────────────┘                                │
│                                                                                  │
│  Responsibilities:                                                            │
│  • HTTP request/response handling                                               │
│  • Pydantic model validation                                                    │
│  • Status code management                                                       │
│  • Error transformation (exceptions → HTTP errors)                              │
│  • OpenAPI/Swagger documentation generation                                     │
│  • NO business logic here!                                                      │
│                                                                                  │
└──────────────────────────────────┬───────────────────────────────────────────────┘
                                   │
                                   │ Calls
                                   ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                        LAYER 2: SERVICE LAYER (Business Logic)                  │
│                      Orchestrates Operations, Enforces Rules                    │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌─────────────────────┐  ┌──────────────────────┐  ┌─────────────────────┐   │
│  │  library_service.py │  │  document_service.py │  │  chunk_service.py   │   │
│  ├─────────────────────┤  ├──────────────────────┤  ├─────────────────────┤   │
│  │ create_library()    │  │ create_document()    │  │ create_chunk()      │   │
│  │ get_library()       │  │ get_document()       │  │ get_chunk()         │   │
│  │ update_library()    │  │ update_document()    │  │ update_chunk()      │   │
│  │ delete_library()    │  │ delete_document()    │  │ delete_chunk()      │   │
│  │ list_libraries()    │  │ add_chunks_to_doc()  │  │ bulk_create()       │   │
│  └─────────────────────┘  └──────────────────────┘  └─────────────────────┘   │
│                                                                                  │
│  ┌─────────────────────┐  ┌──────────────────────┐  ┌─────────────────────┐   │
│  │  search_service.py  │  │   index_service.py   │  │ embedding_service.py│   │
│  ├─────────────────────┤  ├──────────────────────┤  ├─────────────────────┤   │
│  │ search_library()    │  │ build_index()        │  │ get_embedding()     │   │
│  │ - Apply filters     │  │ delete_index()       │  │ batch_embed()       │   │
│  │ - Call index search │  │ update_index()       │  │ - Cohere API        │   │
│  │ - Rank results      │  │ get_index_stats()    │  │ - Retry logic       │   │
│  │ - Format output     │  │ recommend_algo()     │  │ - Error handling    │   │
│  └─────────────────────┘  └──────────────────────┘  └─────────────────────┘   │
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐      │
│  │          metadata_filter_service.py                                   │      │
│  ├──────────────────────────────────────────────────────────────────────┤      │
│  │ apply_filters() - Pre/Post filtering logic                            │      │
│  │ parse_filter_expression() - Query language parsing                    │      │
│  └──────────────────────────────────────────────────────────────────────┘      │
│                                                                                  │
│  Responsibilities:                                                            │
│  • Business logic and rules enforcement                                         │
│  • Transaction coordination across repositories                                 │
│  • Integration with external services (embeddings)                              │
│  • Concurrency control (async/await, locking)                                   │
│  • Algorithm selection and optimization                                         │
│  • Error handling and domain exceptions                                         │
│                                                                                  │
└──────────────────────────────────┬───────────────────────────────────────────────┘
                                   │
                                   │ Uses
                                   ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                    LAYER 3: REPOSITORY LAYER (Data Access)                      │
│                         Abstracts Data Storage/Retrieval                        │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌──────────────────────┐  ┌──────────────────────┐  ┌──────────────────────┐ │
│  │library_repository.py │  │document_repository.py│  │ chunk_repository.py  │ │
│  ├──────────────────────┤  ├──────────────────────┤  ├──────────────────────┤ │
│  │ create(library)      │  │ create(document)     │  │ create(chunk)        │ │
│  │ get(id) → Library    │  │ get(id) → Document   │  │ get(id) → Chunk      │ │
│  │ update(id, library)  │  │ update(id, document) │  │ update(id, chunk)    │ │
│  │ delete(id) → bool    │  │ delete(id) → bool    │  │ delete(id) → bool    │ │
│  │ list() → List[...]   │  │ get_by_library()     │  │ get_by_library()     │ │
│  │ exists(id) → bool    │  │ count() → int        │  │ get_by_document()    │ │
│  └──────────────────────┘  └──────────────────────┘  │ get_vectors()        │ │
│                                                       └──────────────────────┘ │
│                          ┌────────────────────┐                                 │
│                          │   base.py          │                                 │
│                          ├────────────────────┤                                 │
│                          │ BaseRepository     │                                 │
│                          │ - CRUD interface   │                                 │
│                          │ - Common methods   │                                 │
│                          └────────────────────┘                                 │
│                                                                                  │
│  💾 Current Implementation: In-Memory Storage                                   │
│  ┌────────────────────────────────────────────────────────────────────┐        │
│  │  libraries: Dict[UUID, Library] = {}                                │        │
│  │  documents: Dict[UUID, Document] = {}                               │        │
│  │  chunks: Dict[UUID, Chunk] = {}                                     │        │
│  │                                                                      │        │
│  │  + asyncio.Lock() for thread-safety                                 │        │
│  └────────────────────────────────────────────────────────────────────┘        │
│                                                                                  │
│  📋 Responsibilities:                                                            │
│  • CRUD operations on domain entities                                           │
│  • Data persistence abstraction                                                 │
│  • Query implementation                                                         │
│  • Thread-safe access to shared state                                           │
│  • Can be swapped: InMemory → PostgreSQL → Redis (transparent to services)     │
│                                                                                  │
└──────────────────────────────────┬───────────────────────────────────────────────┘
                                   │
                                   │ Stores/Retrieves
                                   ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│                      LAYER 4: DOMAIN LAYER (Core Models)                        │
│                        Pydantic Models, Business Entities                       │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────┐       │
│  │    library.py      │  │   document.py      │  │     chunk.py       │       │
│  ├────────────────────┤  ├────────────────────┤  ├────────────────────┤       │
│  │ class Library:     │  │ class Document:    │  │ class Chunk:       │       │
│  │   id: UUID         │  │   id: UUID         │  │   id: UUID         │       │
│  │   name: str        │  │   library_id: UUID │  │   text: str        │       │
│  │   description: str │  │   document_id: UUID│  │   embedding: List  │       │
│  │   index_type: str  │  │   metadata: dict   │  │   document_id: UUID│       │
│  │   index_status: str│  │   chunk_count: int │  │   library_id: UUID │       │
│  │   metadata: dict   │  │   created_at: dt   │  │   metadata: dict   │       │
│  │   created_at: dt   │  │   updated_at: dt   │  │   created_at: dt   │       │
│  │   updated_at: dt   │  └────────────────────┘  └────────────────────┘       │
│  └────────────────────┘                                                         │
│                                                                                  │
│  ┌────────────────────┐  ┌────────────────────┐  ┌────────────────────┐       │
│  │    search.py       │  │ metadata_filter.py │  │   pagination.py    │       │
│  ├────────────────────┤  ├────────────────────┤  ├────────────────────┤       │
│  │ SearchRequest      │  │ MetadataFilter     │  │ PaginationParams   │       │
│  │ SearchResponse     │  │ FilterOperator     │  │ PaginatedResponse  │       │
│  │ SearchResult       │  │ FilterExpression   │  └────────────────────┘       │
│  └────────────────────┘  └────────────────────┘                                │
│                                                                                  │
│   Responsibilities:                                                            │
│  • Define domain entities with validation                                       │
│  • Type safety with Pydantic                                                    │
│  • Automatic serialization/deserialization                                      │
│  • Field constraints and business rules                                         │
│  • No dependencies on other layers (pure domain)                                │
│                                                                                  │
└─────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────┐
│                   CROSS-CUTTING CONCERNS: INDEXING ALGORITHMS                   │
│                          (Used by Index Service)                                │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐      │
│  │                        base.py (Abstract Interface)                   │      │
│  │ ┌──────────────────────────────────────────────────────────────────┐ │      │
│  │ │  class BaseIndex(ABC):                                            │ │      │
│  │ │    @abstractmethod                                                │ │      │
│  │ │    def build(vectors: List[Tuple[UUID, List[float]]]) -> None    │ │      │
│  │ │    @abstractmethod                                                │ │      │
│  │ │    def search(query: List[float], k: int) -> List[SearchResult]  │ │      │
│  │ │    @abstractmethod                                                │ │      │
│  │ │    def add(chunk_id: UUID, vector: List[float]) -> None          │ │      │
│  │ │    @abstractmethod                                                │ │      │
│  │ │    def remove(chunk_id: UUID) -> bool                             │ │      │
│  │ └──────────────────────────────────────────────────────────────────┘ │      │
│  └──────────────────────────────────────────────────────────────────────┘      │
│                                    │                                             │
│                                    │ implements                                  │
│                    ┌───────────────┴───────────────┐                            │
│                    ▼               ▼               ▼                            │
│  ┌───────────────────────┐  ┌──────────────┐  ┌──────────────────────┐        │
│  │  linear_index.py      │  │kd_tree_index │  │   lsh_index.py       │        │
│  ├───────────────────────┤  ├──────────────┤  ├──────────────────────┤        │
│  │ LinearIndex           │  │ KDTreeIndex  │  │ LSHIndex             │        │
│  │                       │  │              │  │                      │        │
│  │ Time: O(N×D)          │  │ Time: O(logN)│  │ Time: O(N^ρ), ρ<1   │        │
│  │ Space: O(N×D)         │  │      to O(N) │  │ Space: O(L×N×D)      │        │
│  │ Recall: 100%          │  │ Space: O(N×D)│  │ Recall: 90-95%       │        │
│  │                       │  │ Recall: 100% │  │                      │        │
│  │ ✓ Exact search        │  │              │  │ ✓ High dimensions    │        │
│  │ ✓ Simple              │  │ ✓ Fast in low│  │ ✓ Sub-linear search  │        │
│  │ ✗ Poor scalability    │  │   dimensions │  │ ✓ Tunable precision  │        │
│  │                       │  │ ✗ Curse of   │  │ ✗ Approximate        │        │
│  │ Use: <10K vectors     │  │   dimensionlty│  │                      │        │
│  └───────────────────────┘  └──────────────┘  │ Use: 100K+ vectors,  │        │
│                                                │      high-D          │        │
│  ┌──────────────────────┐  ┌───────────────┐  └──────────────────────┘        │
│  │   hnsw_index.py      │  │optimized_     │                                   │
│  ├──────────────────────┤  │linear_index.py│  ┌───────────────────────┐       │
│  │ HNSWIndex            │  ├───────────────┤  │ multiprobe_lsh_       │       │
│  │                      │  │ Optimized     │  │ index.py              │       │
│  │ Time: O(log N)       │  │ Linear        │  ├───────────────────────┤       │
│  │ Space: O(M×N×D)      │  │               │  │ MultiProbe LSH        │       │
│  │ Recall: 95-99%       │  │ + SIMD        │  │                       │       │
│  │                      │  │ + Batching    │  │ + Multi-bucket probe  │       │
│  │ ✓ State-of-the-art   │  │ + NumPy opts  │  │ + Better recall       │       │
│  │ ✓ Best recall/speed  │  └───────────────┘  └───────────────────────┘       │
│  │ ✓ Dynamic inserts    │                                                       │
│  │ ✓ Runtime tunable    │  ┌─────────────────────────────────────────┐        │
│  │ ✗ Complex impl       │  │    index_factory.py                      │        │
│  │ ✗ Higher memory      │  ├─────────────────────────────────────────┤        │
│  │                      │  │ create(type, dimension) → BaseIndex      │        │
│  │ Use: Production!     │  │ recommend_index_type() → IndexType       │        │
│  └──────────────────────┘  └─────────────────────────────────────────┘        │
│                                                                                  │
│   Design Pattern: Strategy Pattern                                            │
│  • Index Service uses BaseIndex interface                                       │
│  • Concrete implementations are interchangeable                                 │
│  • Open/Closed Principle: add new algorithms without modifying existing code    │
│                                                                                  │
└─────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────┐
│                    CROSS-CUTTING CONCERNS: INFRASTRUCTURE                       │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐      │
│  │  persistence/persistence_manager.py                                   │      │
│  ├──────────────────────────────────────────────────────────────────────┤      │
│  │  • save_state() - Snapshot to disk                                    │      │
│  │  • load_state() - Restore from disk                                   │      │
│  │  • write_wal_entry() - Write-Ahead Log                                │      │
│  │  • Periodic snapshots (every 5 min)                                   │      │
│  │  • JSON for metadata, NumPy compressed for vectors                    │      │
│  │  • Atomic writes with temp files                                      │      │
│  └──────────────────────────────────────────────────────────────────────┘      │
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐      │
│  │  distributed/distributed_architecture.py                              │      │
│  ├──────────────────────────────────────────────────────────────────────┤      │
│  │  • Leader-Follower pattern                                            │      │
│  │  • Leader election (Raft-like consensus)                              │      │
│  │  • Heartbeat mechanism                                                │      │
│  │  • Async/Sync replication modes                                       │      │
│  │  • Reads: Any node | Writes: Leader only                              │      │
│  │  • Automatic failover on leader failure                               │      │
│  └──────────────────────────────────────────────────────────────────────┘      │
│                                                                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐      │
│  │  core/error_handlers.py                                               │      │
│  ├──────────────────────────────────────────────────────────────────────┤      │
│  │  • Custom domain exceptions                                           │      │
│  │  • HTTP exception mapping                                             │      │
│  │  • Centralized error handling                                         │      │
│  │  • Structured error responses                                         │      │
│  └──────────────────────────────────────────────────────────────────────┘      │
│                                                                                  │
└─────────────────────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────────────────────┐
│                              DATA FLOW EXAMPLE                                   │
│                   (Search Request: "Find similar documents")                    │
└─────────────────────────────────────────────────────────────────────────────────┘

   1. CLIENT
      │
      │ POST /libraries/{id}/search
      │ { "query_text": "machine learning", "top_k": 5 }
      ▼

   2. API LAYER (search_router.py)
      │
      │ • Validate request with Pydantic
      │ • Extract library_id, query_text, top_k
      ▼
      │ search_service.search_library(lib_id, query_text, top_k)
      ▼

   3. SERVICE LAYER (search_service.py)
      │
      │ A) Get library from repository
      │    library_repo.get(library_id) → Library
      │
      │ B) Generate embedding for query text
      │    embedding_service.get_embedding("machine learning")
      │    └─→ Call Cohere API → [0.123, -0.456, 0.789, ...]
      │
      │ C) Get index for library
      │    index_service.get_index(library_id) → HNSWIndex
      │
      │ D) Search index
      │    index.search(query_vector, k=5)
      │    └─→ HNSW traversal → [chunk1, chunk2, chunk3, chunk4, chunk5]
      │
      │ E) Apply metadata filters (if any)
      │    metadata_filter_service.apply_filters(results, filters)
      │
      │ F) Fetch full chunk data
      │    chunk_repo.get_batch([chunk_ids])
      │
      │ G) Format response
      │    SearchResponse(results=[...], total=5, search_time_ms=12.3)
      ▼

   4. API LAYER
      │
      │ Convert to JSON
      │ Return HTTP 200 with results
      ▼

   5. CLIENT
      │
      │ Receives search results with similarity scores
      └─→ Display to user


┌─────────────────────────────────────────────────────────────────────────────────┐
│                           KEY DESIGN PRINCIPLES                                  │
├─────────────────────────────────────────────────────────────────────────────────┤
│                                                                                  │
│  SOLID PRINCIPLES                                                             │
│                                                                                  │
│  S - Single Responsibility                                                       │
│      Each class has one reason to change                                        │
│      Example: LinearIndex only handles linear search                            │
│                                                                                  │
│  O - Open/Closed                                                                 │
│      Open for extension, closed for modification                                │
│      Example: Add new index types without changing search service               │
│                                                                                  │
│  L - Liskov Substitution                                                         │
│      Subclasses can replace parent classes                                      │
│      Example: Any BaseIndex implementation works in search service              │
│                                                                                  │
│  I - Interface Segregation                                                       │
│      Clients don't depend on unused methods                                     │
│      Example: Search doesn't need to know about index building internals        │
│                                                                                  │
│  D - Dependency Inversion                                                        │
│      Depend on abstractions, not concretions                                    │
│      Example: Service → BaseIndex interface, not specific index class           │
│                                                                                  │
│  ────────────────────────────────────────────────────────────────────────────   │
│                                                                                  │
│   DOMAIN-DRIVEN DESIGN                                                        │
│                                                                                  │
│  • Layered Architecture - Clear separation of concerns                          │
│  • Repository Pattern - Data access abstraction                                 │
│  • Service Layer - Business logic isolation                                     │
│  • Domain Models - Rich, validated entities (Pydantic)                          │
│  • Ubiquitous Language - Consistent terminology (Library, Chunk, Index)         │
│                                                                                  │
│  ────────────────────────────────────────────────────────────────────────────   │
│                                                                                  │
│   CONCURRENCY & THREAD SAFETY                                                  │
│                                                                                  │
│  • async/await throughout (non-blocking I/O)                                    │
│  • asyncio.Lock for shared state access                                         │
│  • Thread-safe repository operations                                            │
│  • FastAPI handles concurrent requests automatically                            │
│                                                                                  │
│  ────────────────────────────────────────────────────────────────────────────   │
│                                                                                  │
│   BEST PRACTICES                                                               │
│                                                                                  │
│  • Type hints everywhere (static typing)                                        │
│  • Pydantic for validation (fail fast)                                          │
│  • Early returns (avoid deep nesting)                                           │
│  • Composition over inheritance                                                 │
│  • No hardcoded values (use constants)                                          │
│  • Comprehensive error handling                                                 │
│  • Unit + Integration testing                                                   │
│                                                                                  │
└─────────────────────────────────────────────────────────────────────────────────┘

------------------------------ TO DO
┌─────────────────────────────────────────────────────────────────────────────────┐
│                           DEPLOYMENT ARCHITECTURE                                │
└─────────────────────────────────────────────────────────────────────────────────┘

                        ┌─────────────────────────┐
                        │      Load Balancer      │
                        └────────────┬────────────┘
                                     │
                 ┌───────────────────┼───────────────────┐
                 │                   │                   │
                 ▼                   ▼                   ▼
         ┌───────────────┐   ┌───────────────┐   ┌───────────────┐
         │  Docker       │   │  Docker       │   │  Docker       │
         │  Container 1  │   │  Container 2  │   │  Container 3  │
         │               │   │               │   │               │
         │ ┌───────────┐ │   │ ┌───────────┐ │   │ ┌───────────┐ │
         │ │ FastAPI   │ │   │ │ FastAPI   │ │   │ │ FastAPI   │ │
         │ │ App       │ │   │ │ App       │ │   │ │ App       │ │
         │ └───────────┘ │   │ └───────────┘ │   │ └───────────┘ │
         │               │   │               │   │               │
         │  Leader       │   │  Follower     │   │  Follower     │
         │  (Writes)     │   │  (Reads)      │   │  (Reads)      │
         └───────┬───────┘   └───────┬───────┘   └───────┬───────┘
                 │                   │                   │
                 └───────────────────┼───────────────────┘
                                     │
                          Replication & Sync
                                     │
                                     ▼
                          ┌──────────────────────┐
                          │   Persistent Storage  │
                          │                      │
                          │  data/               │
                          │  ├─ snapshots/       │
                          │  ├─ wal/             │
                          │  └─ indexes/         │
                          └──────────────────────┘

         Docker Volume: vectordb_data (mounted to /app/data)

Manual Testing Guide - Complete Flow

Once the application is running, navigate to http://localhost:8000/docs to access the interactive Swagger UI.

Step 1: Verify API Health

Click on GET /health
Click "Try it out"
Click "Execute"

Verify the response shows:

{
  "status": "healthy",
  "embedding_service": {
    "provider": "cohere",
    "available": true,
    "dimension": 1024
  }
}

Step 2: Create a Library

Click on POST /libraries
Click "Try it out"

Replace the request body with:

{
  "name": "AI Research Library",
  "description": "Collection of AI and ML documents",
  "index_type": "linear",
  "metadata": {
    "category": "research",
    "owner": "test_user"
  }
}

Click "Execute"
Copy the library_id from the response (you'll need it for next steps)

Example response:

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "AI Research Library",
  "description": "Collection of AI and ML documents",
  "index_type": "linear",
  "metadata": {
    "category": "research",
    "owner": "test_user"
  },
  "created_at": "2024-01-20T10:00:00Z",
  "updated_at": "2024-01-20T10:00:00Z"
}

Step 3: Create a Document

Click on POST /libraries/{library_id}/documents
Click "Try it out"
Enter your library_id in the path parameter

Replace the request body with:

{
  "name": "Introduction to Machine Learning",
  "metadata": {
    "author": "John Doe",
    "year": 2024,
    "topic": "ML Basics"
  }
}

Click "Execute"
Copy the document_id from the response

Step 4: Add Chunks with Text (Embeddings Auto-Generated)

Click on POST /libraries/{library_id}/chunks
Click "Try it out"
Enter your library_id in the path parameter
Add multiple chunks with different texts:

Chunk 1:

{
  "text": "Machine learning is a subset of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed.",
  "document_id": "your-document-id-here",
  "metadata": {
    "chapter": 1,
    "section": "Introduction"
  }
}

Chunk 2:

{
  "text": "Neural networks are computing systems inspired by biological neural networks that constitute animal brains. They are the foundation of deep learning.",
  "document_id": "your-document-id-here",
  "metadata": {
    "chapter": 2,
    "section": "Neural Networks"
  }
}

Chunk 3:

{
  "text": "Natural language processing is a field of AI that gives machines the ability to read, understand, and derive meaning from human language.",
  "document_id": "your-document-id-here",
  "metadata": {
    "chapter": 3,
    "section": "NLP"
  }
}

Execute each chunk creation separately
The API will automatically generate embeddings using Cohere

Step 5: Build an Index (Optional but Recommended)

Click on POST /libraries/{library_id}/index
Click "Try it out"
Enter your library_id in the path parameter
Choose an index type in the request body:
```
{
  "index_type": "kd_tree"
}
```
Options: "linear", "kd_tree", or "lsh"
Click "Execute"

Step 6: Perform Semantic Search

Click on POST /libraries/{library_id}/search
Click "Try it out"
Enter your library_id in the path parameter
Try different search queries:

Example 1 - Text Query:

{
  "query_text": "What is deep learning?",
  "top_k": 5,
  "include_text": true,
  "include_metadata": true
}

Example 2 - Semantic Similarity:

{
  "query_text": "artificial intelligence and neural computation",
  "top_k": 3,
  "include_text": true,
  "min_score": 0.5
}

Example 3 - With Metadata Filters:

{
  "query_text": "machine learning fundamentals",
  "top_k": 5,
  "filters": {
    "chapter": 1
  },
  "include_text": true,
  "include_metadata": true
}

Click "Execute"
Observe the results sorted by similarity score

Expected response structure:

{
  "query": "What is deep learning?",
  "results": [
    {
      "chunk_id": "chunk-uuid",
      "document_id": "doc-uuid",
      "score": 0.8567,
      "text": "Neural networks are computing systems...",
      "metadata": {
        "chapter": 2,
        "section": "Neural Networks"
      }
    }
  ],
  "total_results": 1,
  "search_time_ms": 12.5,
  "index_type": "kd_tree"
}

Step 7: Test Different Index Types

Go back to POST /libraries/{library_id}/index
Rebuild the index with different algorithms:
- "linear" - Brute force search (most accurate)
- "kd_tree" - Tree-based spatial index (fast for low dimensions)
- "lsh" - Locality-Sensitive Hashing (fast approximate search)
Repeat the search queries and compare:
- Search times
- Result accuracy
- Similarity scores -->

Test CRUD Operations

Update a Chunk:

Click on PUT /libraries/{library_id}/chunks/{chunk_id}
Update the text or metadata
The embedding will be automatically regenerated

Delete a Chunk:

Click on DELETE /libraries/{library_id}/chunks/{chunk_id}
Verify deletion with GET /libraries/{library_id}/chunks

List All Libraries:

Click on GET /libraries
See all created libraries with pagination support

Step 9: Advanced Testing Scenarios

Test Bulk Chunk Creation:

Click on POST /libraries/{library_id}/chunks/bulk
Provide multiple texts at once:

{
  "document_id": "your-document-id",
  "texts": [
    "First chunk text about machine learning",
    "Second chunk text about deep learning",
    "Third chunk text about neural networks"
  ]
}

Test Search Without Text (Direct Vector):

First, get an embedding from a chunk
Use POST /libraries/{library_id}/search with:

{
  "query_vector": [0.123, -0.456, 0.789, ...],
  "top_k": 5
}

Performance Testing

Create a library with many chunks (50+)
Compare search performance:
- Without index (linear scan)
- With KD-Tree index
- With LSH index
Monitor response times in the Swagger UI

Expected Results

Successful Library Creation: Returns library with unique ID Chunk Creation: Automatically generates 1024-dimensional embeddings Semantic Search: Returns relevant chunks even with different wording Similarity Scores: Range from 0.0 to 1.0 (higher is more similar) Index Building: Improves search performance for large datasets Metadata Filtering: Correctly filters results based on criteria

Common Test Cases

Semantic Similarity Test:
- Add chunk: "Machine learning is a type of artificial intelligence"
- Search: "AI and ML technologies"
- Expected: High similarity score (>0.6)
Different Topics Test:
- Add chunk: "Python is a programming language"
- Search: "Neural networks and deep learning"
- Expected: Low similarity score (<0.3)
Exact Match Test:
- Add chunk with specific text
- Search with same text
- Expected: Very high similarity score (>0.95)

Troubleshooting

No results returned: Check if chunks were added to the correct library
Low similarity scores: Ensure Cohere API key is set correctly
Index not improving performance: May need more data (>100 chunks) to see benefits
API errors: Check the response details in Swagger UI for specific error messages

API Endpoints Summary

Method	Endpoint	Description
GET	`/`	Root endpoint with API info
GET	`/health`	Health check with service status
POST	`/libraries`	Create a new library
GET	`/libraries`	List all libraries
GET	`/libraries/{id}`	Get library details
PUT	`/libraries/{id}`	Update library metadata
DELETE	`/libraries/{id}`	Delete a library
POST	`/libraries/{id}/documents`	Create a document
GET	`/libraries/{id}/documents/{doc_id}`	Get document details
POST	`/libraries/{id}/chunks`	Add a chunk
POST	`/libraries/{id}/chunks/bulk`	Add multiple chunks
GET	`/libraries/{id}/chunks`	List chunks in library
POST	`/libraries/{id}/index`	Build/rebuild index
POST	`/libraries/{id}/search`	Perform vector search

Testing with cURL

If you prefer command-line testing:

# Create a library
curl -X POST "http://localhost:8000/libraries" \
  -H "Content-Type: application/json" \
  -d '{"name": "Test Library", "index_type": "linear"}'

# Add a chunk (replace library-id and document-id)
curl -X POST "http://localhost:8000/libraries/{library-id}/chunks" \
  -H "Content-Type: application/json" \
  -d '{"text": "Sample text", "document_id": "{document-id}"}'

# Search
curl -X POST "http://localhost:8000/libraries/{library-id}/search" \
  -H "Content-Type: application/json" \
  -d '{"query_text": "Sample query", "top_k": 5}'

Running Automated Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run specific test file
pytest tests/test_api_integration.py -v

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vector Database REST API

Quick Setup

Prerequisites

1. Build the Docker Image

2. Start the Vector Database

3. Verify it's Running

4. Access the API

5. Quick API Test

6. Access Swagger UI

7. Follow the Manual Testing Guide

Architecture Diagram

Complete System Architecture (Domain-Driven Design)

Manual Testing Guide - Complete Flow

Step 1: Verify API Health

Step 2: Create a Library

Step 3: Create a Document

Step 4: Add Chunks with Text (Embeddings Auto-Generated)

Step 5: Build an Index (Optional but Recommended)

Step 6: Perform Semantic Search

Step 7: Test Different Index Types

Test CRUD Operations

Update a Chunk:

Delete a Chunk:

List All Libraries:

Step 9: Advanced Testing Scenarios

Test Bulk Chunk Creation:

Test Search Without Text (Direct Vector):

Performance Testing

Expected Results

Common Test Cases

Troubleshooting

API Endpoints Summary

Testing with cURL

Running Automated Tests

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Vector Database REST API

Quick Setup

Prerequisites

1. Build the Docker Image

2. Start the Vector Database

3. Verify it's Running

4. Access the API

5. Quick API Test

6. Access Swagger UI

7. Follow the Manual Testing Guide

Architecture Diagram

Complete System Architecture (Domain-Driven Design)

Manual Testing Guide - Complete Flow

Step 1: Verify API Health

Step 2: Create a Library

Step 3: Create a Document

Step 4: Add Chunks with Text (Embeddings Auto-Generated)

Step 5: Build an Index (Optional but Recommended)

Step 6: Perform Semantic Search

Step 7: Test Different Index Types

Test CRUD Operations

Update a Chunk:

Delete a Chunk:

List All Libraries:

Step 9: Advanced Testing Scenarios

Test Bulk Chunk Creation:

Test Search Without Text (Direct Vector):

Performance Testing

Expected Results

Common Test Cases

Troubleshooting

API Endpoints Summary

Testing with cURL

Running Automated Tests