MMMeta

MMMeta is a local-first, production-oriented platform for generating multimedia metadata. It ingests large media archives, normalizes discoverable metadata into structured JSON, enriches assets with AI-derived metadata, indexes artifacts for search, and exports interoperable archives.

Highlights

FastAPI API server with OpenAPI docs, health endpoints, metrics, API key auth, optional JWT auth, and WebSocket job progress streaming
Typer CLI with ingest, process, watch, search, export, validate, providers, pipelines, jobs, embeddings, subtitles, summarize, transcribe, analyze, config, migrate, stats, dedupe, and benchmark commands
Async SQLAlchemy 2.x persistence with SQLite by default and PostgreSQL support through configuration
Resumable job queue with persistent state, retries, cancellation, incremental processing, content hashing, and deduplication
Extensible plugin SDK for custom extractors, providers, exporters, vector stores, pipelines, and enrichment steps
Search stack with SQLite FTS5 plus vector similarity abstraction
Local artifact storage with content-addressable layout
Built-in subtitle parsing, JSON metadata mapping, image metadata extraction, audio heuristics, and video sidecar inspection
Docker, Compose, Alembic, tests, sample plugin, sample data, and React/Vite/Tailwind UI
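
As an illustration of the content-addressable layout mentioned above, here is a minimal sketch in which an artifact's storage path is derived from the SHA-256 digest of its bytes. The function names and the two-level sharding scheme are assumptions made for this example; they are not taken from the MMMeta storage backend.

```python
import hashlib
import shutil
from pathlib import Path


def content_address(digest: str, root: Path) -> Path:
    """Map a hex digest to a sharded path: root/<first 2 chars>/<next 2>/<digest>."""
    # Hypothetical layout: two levels of sharding keep directories small.
    return root / digest[:2] / digest[2:4] / digest


def store_artifact(source: Path, root: Path) -> Path:
    """Copy source into the store under its SHA-256 digest and return the path."""
    digest = hashlib.sha256(source.read_bytes()).hexdigest()
    target = content_address(digest, root)
    if not target.exists():  # identical content is stored only once
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
    return target
```

Because the path depends only on the content, storing the same bytes twice yields the same location, which is what makes deduplication cheap.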

Quick Start

The commands below use Windows Command Prompt syntax. On macOS or Linux, activate the environment with source .venv/bin/activate, use cp instead of copy, and use forward slashes in paths.

python -m venv .venv
.venv\Scripts\activate
pip install -e .[dev,parquet]
copy .env.example .env
mmmeta config show
mmmeta migrate
mmmeta ingest examples\sample_data
mmmeta process run
mmmeta api serve --host 127.0.0.1 --port 8080

Open:

API docs: http://127.0.0.1:8080/docs
Metrics: http://127.0.0.1:8080/metrics

Repository Layout

src/mmmeta/
  api/           FastAPI application and routers
  cli/           Typer CLI
  core/          Configuration, logging, metrics, utilities
  db/            Engine, sessions, initialization helpers
  exporters/     JSON, NDJSON, CSV, Parquet, Markdown exporters
  extractors/    Metadata extraction and parsing modules
  legacy/        Adapters for existing subtitle workflows
  models/        SQLAlchemy ORM models
  pipelines/     DAG pipeline framework and built-ins
  plugins/       SDK and plugin discovery
  providers/     OpenAI-compatible provider abstraction
  schemas/       Pydantic v2 models and JSON schema exports
  security/      API key, JWT, rate limiting
  storage/       Artifact storage backends
  vectorstores/  Embedding storage and similarity search
  workers/       Persistent queue execution

Core Workflows

ingest scans directories recursively, hashes supported files, records assets, and captures sidecar metadata.
process run executes the built-in pipeline graph against queued assets.
search performs full-text, semantic, or hybrid search over normalized metadata.
export writes normalized datasets as JSON, NDJSON, CSV, Parquet, Markdown, or SQLite snapshots.
watch monitors directories and auto-enqueues new or changed assets.
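
The ingest step described above (recursive scan, content hashing, duplicate detection) can be sketched roughly as follows. This is not the MMMeta implementation: the SUPPORTED extension set and the function names are hypothetical, and SHA-256 is assumed as the hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Hypothetical set of supported extensions; the real list may differ.
SUPPORTED = {".srt", ".json", ".jpg", ".png", ".mp3", ".mp4"}


def hash_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path) -> dict[str, list[Path]]:
    """Walk root recursively and group supported files by content hash."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            by_hash[hash_file(path)].append(path)
    return dict(by_hash)
```

Any hash that maps to more than one path marks a duplicate, and a stored hash that no longer matches a file on disk marks a changed asset that would need reprocessing.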

Legacy Compatibility

The original convert_srts_to_metadata.py flow is preserved as a legacy adapter. The new system can ingest the same SRT and .info.json companion layout while routing derivative generation through the configurable provider layer.
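
To make the companion layout concrete, the sketch below pairs each .info.json file with the SRT files that share its base name. The naming convention (name.info.json next to name.srt or name.<lang>.srt), the "title" field, and the pair_sidecars function are assumptions for illustration; the legacy adapter may match files differently.

```python
import json
from pathlib import Path

INFO_SUFFIX = ".info.json"


def pair_sidecars(directory: Path) -> list[dict]:
    """Pair each *.info.json file with the SRT files that share its base name."""
    records = []
    for info_path in sorted(directory.glob("*" + INFO_SUFFIX)):
        base = info_path.name[: -len(INFO_SUFFIX)]
        # Assumed convention: subtitles are named <base>.srt or <base>.<lang>.srt.
        subtitles = sorted(
            p.name for p in directory.glob("*.srt") if p.name.startswith(base + ".")
        )
        info = json.loads(info_path.read_text(encoding="utf-8"))
        records.append(
            {"base": base, "title": info.get("title"), "subtitles": subtitles}
        )
    return records
```

Each record then carries enough context (source metadata plus subtitle file names) to hand to a later enrichment step.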

Running Tests

ruff check .
mypy src
pytest
