Semantic Memory Service

This microservice provides long-term temporal and semantic context for the Cognitive Companion system. It uses PostgreSQL with pgvectorscale (StreamingDiskANN) for high-performance vector similarity search over CLIP embeddings, structured scene observations, and person movement transitions.
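
To make the idea of vector similarity search concrete, the sketch below ranks a few toy embeddings by cosine distance in plain Python. It is an exact brute-force scan for illustration only; the service delegates this work to PostgreSQL, where a StreamingDiskANN index answers such queries approximately. The function names, the 3-dimensional vectors, and the labels are assumptions made up for this example, and the distance metric the service actually uses is not stated in this README.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    items: dict[str, Sequence[float]],
    k: int = 2,
) -> list[tuple[str, float]]:
    """Score every item against the query and return the k closest."""
    scored = [(label, cosine_distance(query, vec)) for label, vec in items.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy 3-dimensional "embeddings" (hypothetical); real CLIP vectors are much larger.
EMBEDDINGS: dict[str, Sequence[float]] = {
    "kitchen, kettle on": [0.9, 0.1, 0.0],
    "kitchen, empty": [0.8, 0.3, 0.1],
    "bedroom, lights off": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for label, distance in top_k([1.0, 0.0, 0.0], EMBEDDINGS):
        print(f"{label}: {distance:.3f}")
```

An index such as StreamingDiskANN trades this exhaustive scan for a graph traversal, which is why results from the service may differ slightly from an exact ranking.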

Features

Semantic Search: Vector similarity searches using CLIP and text embeddings (via Triton Inference Server)
Temporal Context: Query observations and movements within specific time windows
Object Tracking: Aggregate and track the "last seen" status of objects in specific rooms
Movement Inference: Track semantic room transitions (entering/exiting) for person identification and activity enrichment
Retention Management: Automated data pruning based on configurable retention policies
High-Performance: Built with FastAPI, with asynchronous database access via psycopg3 and uv for dependency management

Tech Stack

Language: Python 3.14+
Package Manager: uv
Framework: FastAPI
Database: PostgreSQL + pgvectorscale (StreamingDiskANN indexes)
Migrations: Alembic with async psycopg3
Linting/Formatting: Ruff
Static Analysis: Mypy
Testing: Pytest + pytest-asyncio
Containerization: Docker

Getting Started

Prerequisites

Python 3.14+
uv
Docker & Docker Compose
PostgreSQL with pgvectorscale extension

Local Development

Install dependencies:
```
uv sync
```
Run with Docker Compose:
```
docker compose up --build
```
The API will be available at http://localhost:8400.
Run tests:
```
uv run pytest
```

Linting and type checking:
```
uv run ruff check .
uv run mypy app/ --ignore-missing-imports --explicit-package-bases
```

API Documentation

Once the service is running, interactive Swagger documentation is available at: http://localhost:8400/docs

API Endpoints

Observations

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/v1/observations/` | Create a new scene observation |
| POST | `/api/v1/observations/search` | Search observations using vector similarity |
| DELETE | `/api/v1/observations/prune` | Prune observations older than N days |
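
The sketch below shows how a client might assemble a call to the search endpoint using only the standard library. The request body is an assumption: `query_text` and `limit` are hypothetical field names chosen for illustration, and the authoritative schema is the one shown in the service's Swagger documentation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8400"  # local development address used in this README


def build_search_request(query_text: str, limit: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST request to the observation search endpoint."""
    # ASSUMPTION: "query_text" and "limit" are hypothetical field names.
    payload = {"query_text": query_text, "limit": limit}
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/observations/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_search_request("keys on the kitchen counter")
    print(request.get_method(), request.full_url)
    # Sending the request requires the service to be running:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```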

Movements

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/api/v1/movements/` | Create a new movement record |
| GET | `/api/v1/movements/transitions` | Get movement transitions for a person |
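
To illustrate what a movement transition is, the following sketch derives room-to-room transitions from a series of timestamped sightings. It is one plausible approach, not a description of the service's implementation; the `Sighting` and `Transition` types and their fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Sighting:
    person_id: str
    room_id: str
    seen_at: datetime


@dataclass(frozen=True)
class Transition:
    person_id: str
    from_room: str
    to_room: str
    at: datetime


def infer_transitions(sightings: list[Sighting]) -> list[Transition]:
    """Emit a transition whenever a person's room changes between sightings."""
    transitions: list[Transition] = []
    last_room: dict[str, str] = {}  # most recent room per person
    for sighting in sorted(sightings, key=lambda s: s.seen_at):
        previous = last_room.get(sighting.person_id)
        if previous is not None and previous != sighting.room_id:
            transitions.append(
                Transition(sighting.person_id, previous, sighting.room_id, sighting.seen_at)
            )
        last_room[sighting.person_id] = sighting.room_id
    return transitions


if __name__ == "__main__":
    day = datetime(2025, 1, 1)
    demo = [
        Sighting("alice", "kitchen", day.replace(hour=8)),
        Sighting("alice", "kitchen", day.replace(hour=9)),
        Sighting("alice", "hallway", day.replace(hour=10)),
    ]
    for transition in infer_transitions(demo):
        print(transition)
```

Repeated sightings in the same room produce no transition, so the output stays compact even when observations arrive frequently.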

Objects

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/api/v1/objects/{room_id}/recent` | Get recent object presence in a room |
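
The "last seen" aggregation behind this endpoint can be pictured with a small in-memory sketch. The tuple layout, function names, and the `since` filter are assumptions for illustration; the service performs the equivalent aggregation in PostgreSQL.

```python
from datetime import datetime
from typing import Iterable

Detection = tuple[str, str, datetime]  # (room_id, object_label, seen_at), hypothetical layout


def last_seen_by_room(detections: Iterable[Detection]) -> dict[str, dict[str, datetime]]:
    """Keep only the newest timestamp for each (room, object) pair."""
    latest: dict[str, dict[str, datetime]] = {}
    for room_id, label, seen_at in detections:
        room = latest.setdefault(room_id, {})
        if label not in room or seen_at > room[label]:
            room[label] = seen_at
    return latest


def recent_objects(
    latest: dict[str, dict[str, datetime]],
    room_id: str,
    since: datetime,
) -> list[tuple[str, datetime]]:
    """Return objects seen in the room at or after `since`, newest first."""
    room = latest.get(room_id, {})
    hits = [(label, ts) for label, ts in room.items() if ts >= since]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    day = datetime(2025, 1, 1)
    index = last_seen_by_room(
        [
            ("kitchen", "keys", day.replace(hour=8)),
            ("kitchen", "mug", day.replace(hour=9)),
            ("kitchen", "keys", day.replace(hour=12)),
        ]
    )
    print(recent_objects(index, "kitchen", since=day.replace(hour=10)))
```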

Health

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/health` | Health check endpoint |

Error Handling

The service uses standard HTTP status codes and returns errors in a consistent format:

```
{
  "detail": "Error message describing what went wrong"
}
```

Status Codes

200 OK: Successful request
201 Created: Resource successfully created
400 Bad Request: Invalid input or store operation failed
422 Unprocessable Entity: Request validation error
500 Internal Server Error: Service-level error (e.g., search failure)
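
A client can turn these responses into exceptions with a small helper like the one sketched below. The `SemanticMemoryError` class and `raise_for_error` function are hypothetical names, not part of the service. The helper also tolerates a non-string `detail`, since FastAPI's default 422 validation responses carry a list of issues there.

```python
import json


class SemanticMemoryError(Exception):
    """Hypothetical client-side error carrying the status code and detail text."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail

    @property
    def retryable(self) -> bool:
        # Only server-side failures (5xx) are worth retrying unchanged.
        return self.status >= 500


def raise_for_error(status: int, body: bytes) -> None:
    """Raise SemanticMemoryError for 4xx/5xx responses; return None otherwise."""
    if status < 400:
        return
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object: fall back to the raw text.
        detail = body.decode("utf-8", errors="replace")
    if not isinstance(detail, str):
        detail = json.dumps(detail)  # e.g. the list of issues in a 422 response
    raise SemanticMemoryError(status, detail)


if __name__ == "__main__":
    try:
        raise_for_error(400, b'{"detail": "Invalid input"}')
    except SemanticMemoryError as error:
        print(error, "retryable:", error.retryable)
```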

Architecture

The service follows a layered architecture:

API Layer (Routers): RESTful endpoints with request validation and error handling
Service Layer (Stores/Services): Business logic and database operations
Data Access Layer (Connection): PostgreSQL connection pooling via psycopg3
Persistence Layer (PostgreSQL): Structured and vector data using pgvectorscale
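
The separation between the API layer and the service layer can be sketched without any framework. Everything below is hypothetical: the `Observation` fields, the `ObservationStore` protocol, and the in-memory store that stands in for the PostgreSQL-backed one. The point is only that the handler validates input and maps results to status codes, while the store owns persistence.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Observation:
    room_id: str
    description: str


class ObservationStore(Protocol):
    """Service-layer contract that the API layer depends on."""

    async def insert(self, observation: Observation) -> int: ...


class InMemoryObservationStore:
    """Stand-in for a database-backed store; keeps rows in a list."""

    def __init__(self) -> None:
        self.rows: list[Observation] = []

    async def insert(self, observation: Observation) -> int:
        self.rows.append(observation)
        return len(self.rows)  # acts as the new row's id


async def create_observation(store: ObservationStore, payload: dict) -> tuple[int, dict]:
    """API-layer handler: validate, delegate to the store, map to a status code."""
    room_id = payload.get("room_id")
    description = payload.get("description")
    if not isinstance(room_id, str) or not isinstance(description, str):
        return 422, {"detail": "room_id and description must be strings"}
    new_id = await store.insert(Observation(room_id, description))
    return 201, {"id": new_id}


if __name__ == "__main__":
    store = InMemoryObservationStore()
    print(asyncio.run(create_observation(store, {"room_id": "kitchen", "description": "kettle on"})))
```

Because the handler only depends on the protocol, tests can swap in the in-memory store without touching a database.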

Lifecycle Management

The application uses FastAPI's lifespan context manager to handle:

Database connection initialization on startup
Automatic Alembic migration on startup
Graceful shutdown with connection cleanup
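
The startup and shutdown ordering above follows the usual async context manager pattern, sketched here with the standard library only. `DemoApp` and the three helper coroutines are hypothetical placeholders for the real application object, pool setup, and Alembic upgrade.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass, field
from typing import AsyncIterator


@dataclass
class DemoApp:
    """Hypothetical stand-in for the application object; records lifecycle events."""

    log: list[str] = field(default_factory=list)


async def open_pool(app: DemoApp) -> None:
    app.log.append("pool opened")  # placeholder for creating the connection pool


async def run_migrations(app: DemoApp) -> None:
    app.log.append("migrations applied")  # placeholder for upgrading to head


async def close_pool(app: DemoApp) -> None:
    app.log.append("pool closed")  # placeholder for releasing connections


@asynccontextmanager
async def lifespan(app: DemoApp) -> AsyncIterator[None]:
    await open_pool(app)
    try:
        await run_migrations(app)
        yield  # the application serves requests while suspended here
    finally:
        await close_pool(app)  # runs on normal shutdown and on errors


async def main() -> DemoApp:
    app = DemoApp()
    async with lifespan(app):
        app.log.append("serving")
    return app


if __name__ == "__main__":
    print(asyncio.run(main()).log)
```

With FastAPI, a function of this shape is passed to the application as `FastAPI(lifespan=lifespan)`.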

Configuration

Configuration is managed via pydantic-settings:

DATABASE_URL: PostgreSQL connection string (default: postgresql://postgres:postgres@localhost:5432/semantic_memory)
API_V1_STR: API version prefix (default: /api/v1)
PROJECT_NAME: Service name for API documentation (default: semantic-memory-service)
RETENTION_DAYS: Default data retention period (default: 90)
TEXT_EMBEDDING_ENABLED: Enable text embedding fallback (default: true)
TRITON_URL: Triton Inference Server gRPC endpoint (default: localhost:8701)
TRITON_TEXT_EMBEDDING_MODEL: Triton model name for text embeddings (default: embeddinggemma-300m)
TRITON_TEXT_EMBEDDING_TOKENIZER_PATH: Path to the tokenizer.json for the embedding model
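
The following sketch mirrors the documented variables and defaults using only the standard library, to show how environment overrides interact with defaults. It is a stand-in for illustration, not the service's pydantic-settings class; the `Settings.from_env` helper and `_as_bool` parser are hypothetical, and the tokenizer path is omitted because no default is documented.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(raw: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    database_url: str = "postgresql://postgres:postgres@localhost:5432/semantic_memory"
    api_v1_str: str = "/api/v1"
    project_name: str = "semantic-memory-service"
    retention_days: int = 90
    text_embedding_enabled: bool = True
    triton_url: str = "localhost:8701"
    triton_text_embedding_model: str = "embeddinggemma-300m"

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "Settings":
        """Build settings from a mapping, falling back to the defaults above."""
        d = cls()
        return cls(
            database_url=env.get("DATABASE_URL", d.database_url),
            api_v1_str=env.get("API_V1_STR", d.api_v1_str),
            project_name=env.get("PROJECT_NAME", d.project_name),
            retention_days=int(env.get("RETENTION_DAYS", d.retention_days)),
            text_embedding_enabled=_as_bool(
                env.get("TEXT_EMBEDDING_ENABLED", str(d.text_embedding_enabled))
            ),
            triton_url=env.get("TRITON_URL", d.triton_url),
            triton_text_embedding_model=env.get(
                "TRITON_TEXT_EMBEDDING_MODEL", d.triton_text_embedding_model
            ),
        )


if __name__ == "__main__":
    print(Settings.from_env({"RETENTION_DAYS": "30", "TEXT_EMBEDDING_ENABLED": "false"}))
```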

Database Migrations

Database schema changes are managed with Alembic. Migrations run automatically on application startup.

Creating a new migration

```
uv run alembic revision -m "description of the change"
```

This generates a new file in app/db/alembic/versions/. Fill in the upgrade() and downgrade() functions.

To apply migrations manually:

```
uv run alembic upgrade head
```

Schema

scene_observations: Scene observations with CLIP and text embeddings
person_movements: Person movement transitions between rooms
object_presence: Object presence aggregated by room
person_current_location: Materialized view for current person locations
