Discover underexplored biomedical datasets through transparent, deterministic scoring.
A scientific instrument for finding GEO, SRA, Zenodo, ENA, HCA, Expression Atlas, and Open Targets datasets that deserve a second look — local-first, BYOK, fully auditable.
Documentation · Changelog · Roadmap · ADRs · Security · Contributing
Demo video: <!-- VIDEO_URL --> · recording recategorised P1 → P2 in ADR-0004
GEO and SRA together index over 1.4 million biomedical datasets. The vast majority are cited fewer than five times. Researchers re-analyse the same handful of famous accessions because the ones with quiet potential are buried by a flat search interface, by sparse metadata, and by the cost of triaging hundreds of candidates by hand.
Meta Omix scores every dataset it ingests on six deterministic components — reuse potential, citation underexploration, disease relevance, metadata completeness, data-type leverage, recency — and surfaces the under-cited datasets where the science is strong, the metadata is rich, and a re-analysis would actually move the needle.
The scoring formula is calibrated against a curated ground truth (Pearson r = 0.92, Spearman r = 0.90), versioned in ADR-0001, and fully reproducible offline.
The score is the heart. The LLM is decoration.
Cloud LLMs are an opt-in narrative layer that explains why a score is what it is — they never decide what the score is.
- Highlights
- Quickstart — one command
- What `docker compose up` actually does
- Architecture
- Service map
- Scoring methodology
- Data sources
- BYOK and AI providers
- Pipelines
- Observability
- Configuration
- Development
- Testing
- Deployment
- Supply chain
- Roadmap
- Contributing
- Governance
- License
- Citation
- Acknowledgements
- 🧬 Deterministic 6-component scoring — calibrated via L2-regularised grid search against 51 ground-truth datasets. Pearson r = 0.92, Spearman r = 0.90. ADR-versioned.
- 🌐 10 ingestion sources wired with circuit breakers, retries, and per-source rate limits — GEO, PubMed, iCite, SRA, Zenodo, Europe PMC, ENA, HCA, Expression Atlas, Open Targets.
- 🔐 BYOK with AES-GCM 256 + Argon2id master-password derivation; keys never leave your machine. Nine cloud providers + Ollama local. Provenance recorded for every model call.
- 🧪 Three Snakemake pipelines out of the box: bulk RNA-seq (STAR + Salmon + DESeq2), single-cell scRNA (scanpy), microbiome diversity (DADA2 + scikit-bio).
- 📐 22 UI routes, WCAG 2.2 AA, Lighthouse a11y ≥ 95 enforced in CI; design tokens cover dark + light + density-compact.
- 🔭 Observability built-in: OpenTelemetry traces, Prometheus metrics (`metaomix_*` counters / gauges / histograms), Loki + Promtail + Grafana dashboards.
- 🛡️ Supply-chain hardened: multi-arch Docker images, SBOM signed with cosign, attestations, dependabot, CodeQL, gitleaks, Trivy, OSV-Scanner, dependency-review.
- 🧾 Provenance everywhere: every LLM-generated artefact carries `provider`, `model`, `seed`, `prompt_hash`, `response_hash`, `fallback_used`, `fallback_reason`.
- 🌱 Local-first by default: `docker compose up`, no cloud account required. The platform is fully usable without a single API key.
- ⚙️ Zero-touch bootstrap: Postgres extensions, Alembic migrations, MinIO bucket, Ollama models, all set up automatically on the first `up`. No manual follow-up commands.
- 📜 Apache 2.0: explicit patent grant, institution-friendly, single-maintainer governance.
git clone https://github.com/edgarzzin/meta-omix.git
cd meta-omix
docker compose up -d
That's it. Sixty seconds later, open http://localhost:3000.
The defaults in docker-compose.yml are designed so the first up produces a fully functional stack:
- Postgres extensions installed (`infra/postgres-init/00-extensions.sql`).
- Alembic migrated to `head` automatically by the `api-migrate` init container.
- MinIO bucket created and made publicly readable for object downloads (`minio-init`).
- Ollama bootstrapped with the default chat model (`qwen2.5:7b`) and the default embedding model (`nomic-embed-text`) (`ollama-init`).
- Nine long-running services healthy: postgres, redis, minio, ollama, api, worker, beat, flower, web.
- Three one-shot init services exit cleanly: `api-migrate`, `minio-init`, `ollama-init`.
If you want to override anything, copy the env template:
cp .env.example .env
$EDITOR .env # toggle ENABLE_TELEMETRY, point OLLAMA at a host install, …
docker compose up -d # picks up .env automatically
To skip the Ollama auto-pull (e.g. on a bandwidth-constrained machine):
OLLAMA_AUTO_PULL=false docker compose up -d
To stop everything:
docker compose down # stop containers, keep volumes
docker compose down -v # stop AND wipe volumes (postgres, minio, ollama models)
The bring-up is choreographed through depends_on health and completion gates so the human never has to follow up with manual commands. Top to bottom:
1. postgres ──► healthy (pg_isready)
2. redis ─────► healthy (redis-cli ping)
3. minio ─────► healthy (mc ready local)
4. ollama ────► healthy (ollama list)
5. api-migrate ──► (depends on postgres) ──► alembic upgrade head ──► exit 0
6. minio-init ───► (depends on minio) ──► create bucket + policy ──► exit 0
7. ollama-init ──► (depends on ollama) ──► pull default models ──► exit 0
8. api ──► (depends on postgres + redis + minio + api-migrate + minio-init)
──► uvicorn → listening on :8000
9. worker ──► (depends on api + api-migrate)
──► celery -A app.workers.celery_app worker
10. beat ────► (depends on worker)
──► celery beat with RedBeat scheduler
11. flower ──► (depends on redis)
──► port 5556 (browser) / 5555 (internal)
12. web ──────► (depends on api healthy)
──► next start → listening on :3000
docker compose up -d returns when every long-running service is healthy. The init containers do not block subsequent up invocations because Alembic, mc mb, and ollama pull are all idempotent.
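The ordering above is a topological sort of the `depends_on` graph. As a rough illustration (not how Compose itself is implemented), the sketch below transcribes the listed dependencies into a dictionary and groups services into start-up waves with the standard-library `graphlib`. The `DEPENDS_ON` mapping and the `startup_waves` helper are names invented for this example.

```python
from graphlib import TopologicalSorter

# Dependency edges transcribed from the listing above:
# service -> the services it waits for.
DEPENDS_ON = {
    "postgres": set(),
    "redis": set(),
    "minio": set(),
    "ollama": set(),
    "api-migrate": {"postgres"},
    "minio-init": {"minio"},
    "ollama-init": {"ollama"},
    "api": {"postgres", "redis", "minio", "api-migrate", "minio-init"},
    "worker": {"api", "api-migrate"},
    "beat": {"worker"},
    "flower": {"redis"},
    "web": {"api"},
}


def startup_waves(graph):
    """Group services into waves whose dependencies are met by earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on a circular dependency
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything startable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave as healthy / completed
    return waves


WAVES = startup_waves(DEPENDS_ON)
for number, wave in enumerate(WAVES, start=1):
    print(number, ", ".join(wave))
```

Each wave can start only once the previous one is healthy or has exited 0, which is why `api` sits alone in the middle and `beat` comes last.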
graph LR
classDef source fill:#eef4ff,stroke:#3a6fd6,color:#0e1f44;
classDef api fill:#f5f6f8,stroke:#404750,color:#111316;
classDef store fill:#e1dbf6,stroke:#5a48bd,color:#150f37;
classDef ai fill:#f3f0fb,stroke:#5a48bd,color:#322777;
classDef ui fill:#eef4ff,stroke:#3a6fd6,color:#0e1f44;
classDef obs fill:#dcecfa,stroke:#1d7387,color:#0a3a48;
subgraph Sources [Public scientific archives]
GEO[GEO]:::source
PubMed[PubMed]:::source
iCite[iCite]:::source
SRA[SRA]:::source
Zenodo[Zenodo]:::source
EPMC[Europe PMC]:::source
ENA[ENA]:::source
HCA[HCA Data Portal]:::source
GXA[Expression Atlas]:::source
OT[Open Targets]:::source
end
subgraph Backend [Meta Omix backend - FastAPI]
Ingest[Ingestion service]:::api
Score[Scoring engine]:::api
NLP[NER + ontology mapping]:::api
AI[Provider registry]:::api
end
subgraph Data [Persistence]
PG[(PostgreSQL + pgvector)]:::store
Redis[(Redis)]:::store
MinIO[(MinIO / S3)]:::store
DuckDB[(DuckDB)]:::store
end
subgraph Local [Local LLM]
Ollama((Ollama)):::ai
end
subgraph Cloud [Optional BYOK cloud providers]
CloudAI[Anthropic / OpenAI / Bedrock<br/>Azure / Groq / Mistral<br/>Gemini / Cohere / Voyage]:::ai
end
subgraph UI [Frontend - Next.js 15]
Web[22 routes<br/>Server + Client Components]:::ui
end
subgraph Obs [Observability]
OTEL[OpenTelemetry Collector]:::obs
Prom[Prometheus]:::obs
Loki[Loki]:::obs
Graf[Grafana]:::obs
end
Sources -->|HTTP / Entrez / GraphQL| Ingest
Ingest --> PG
Score --> PG
NLP --> PG
AI --> Ollama
AI -.opt-in.-> CloudAI
Backend --> Redis
Backend --> MinIO
Backend --> DuckDB
Web -->|/api/v1/*| Backend
Backend --> OTEL
OTEL --> Prom
OTEL --> Loki
Prom --> Graf
Loki --> Graf
Detailed architecture documentation lives in apps/docs/docs/architecture.md.
Twelve services run from docker-compose.yml. The three init services exit with code 0; the rest stay up.
| Service | Image | Purpose | Port (host bind) | Lifecycle |
|---|---|---|---|---|
| `postgres` | `pgvector/pgvector:pg16` | Primary store + vector search (HNSW index) | internal only | long-running |
| `redis` | `redis:7-alpine` | Cache + Celery broker (DB 0/1/2) + RedBeat scheduler | internal only | long-running |
| `minio` | `minio/minio` | S3-compatible object store for reports, sample-matrix cache, notebooks, pipeline outputs | console `127.0.0.1:9101` | long-running |
| `ollama` | `ollama/ollama:0.5.4` | Local LLM runtime (default Meta Omix AI provider) | `127.0.0.1:11434` | long-running |
| `api` | `metaomix/api:dev` | FastAPI backend, 55 OpenAPI paths, SSE for ingestion events | `127.0.0.1:8000` | long-running |
| `worker` | `metaomix/api:dev` | Celery worker (queues: ingestion, ai-cpu, ai-cloud, reports, maintenance) | internal only | long-running |
| `beat` | `metaomix/api:dev` | RedBeat scheduler: nightly refresh, weekly rescore, nightly Postgres backup | internal only | long-running |
| `flower` | `mher/flower:2.0` | Celery dashboard, embedded as a sandboxed iframe in `/operations/jobs` | `127.0.0.1:5556` | long-running |
| `web` | `metaomix/web:dev` | Next.js 15 frontend, 22 routes | `127.0.0.1:3000` | long-running |
| `api-migrate` | `metaomix/api:dev` | One-shot: `alembic upgrade head` | none | exits 0 |
| `minio-init` | `minio/mc` | One-shot: `mc mb` + `mc anonymous set download` | none | exits 0 |
| `ollama-init` | `ollama/ollama` | One-shot: `ollama pull` for chat + embed default models | none | exits 0 |
Optional observability stack lives in docker-compose.observability.yml (grafana, loki, promtail, prometheus, otel-collector) and is brought up with docker compose -f docker-compose.observability.yml up -d.
Each dataset receives a single opportunity score in [0, 100], the weighted sum of six components:
| Component | Weight | What it captures |
|---|---|---|
| Reuse potential | 0.22 | Are samples deep enough, raw enough, and licensed permissively enough that re-analysis is feasible? |
| Citation underexploration | 0.20 | Is this dataset cited fewer times than its sample size + recency would predict? |
| Disease relevance | 0.16 | Is the disease a high-burden, high-research-priority area (NIH categorical spending + DALYs)? |
| Metadata completeness | 0.16 | How rich is the metadata (10 weighted fields, scored via ADR-0002)? |
| Data-type leverage | 0.14 | RNA-seq / scRNA-seq / microbiome / multi-omics weighted by analysis-tooling maturity. |
| Recency | 0.12 | Decay function favouring datasets recent enough to use modern reference genomes / catalogues. |
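As a minimal sketch of the weighted sum, the snippet below combines six component values into one score using the weights from the table. It assumes each component is already normalised to [0, 1]. The function name `opportunity_score`, the dictionary keys, and the example values are illustrative and not taken from the codebase.

```python
# Weights copied from the table above; they sum to 1.0.
WEIGHTS = {
    "reuse_potential": 0.22,
    "citation_underexploration": 0.20,
    "disease_relevance": 0.16,
    "metadata_completeness": 0.16,
    "data_type_leverage": 0.14,
    "recency": 0.12,
}


def opportunity_score(components: dict[str, float]) -> float:
    """Weighted sum of six components (each assumed in [0, 1]), scaled to [0, 100]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = components[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
        total += weight * value
    return round(100.0 * total, 2)


# Invented component values, for illustration only.
example = {
    "reuse_potential": 0.9,
    "citation_underexploration": 0.8,
    "disease_relevance": 0.5,
    "metadata_completeness": 0.7,
    "data_type_leverage": 0.6,
    "recency": 0.4,
}
print(opportunity_score(example))
```

Because the weights sum to 1 and every component is bounded, the result cannot leave [0, 100].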
Calibration. Weights were fit by L2-regularised grid search against a 51-dataset ground truth curated by domain experts, with the regularisation prior keeping the weights close to a documented prior in apps/api/app/services/scoring/weights.py. Pearson r = 0.92, Spearman r = 0.90 against the held-out fold. Methodology and results live in ADR-0001. The full per-component breakdown is documented in docs/scoring.md.
Reproducibility. The calibration data lives at data/{ground_truth.csv, dataset_metadata.csv, calibration_report.json}. The calibration script is apps/api/app/scripts/calibrate_weights.py. Re-running it from a clean checkout reproduces the published weights.
| Source | Tier | Type | Example accession |
|---|---|---|---|
| GEO — NCBI Gene Expression Omnibus | T1 | Bulk + scRNA + microarray | GSE12345 |
| PubMed — bibliographic enrichment | T1 | Citations | PMID:34567890 |
| iCite — NIH citation metrics | T1 | Citations | iCite:34567890 |
| SRA — Sequence Read Archive | T2 | Raw reads | SRP123456 / PRJNA234567 |
| Zenodo — open scientific repository | T2 | Reanalysis bundles | zenodo:7654321 |
| Europe PMC — fallback for PubMed | T2 | Citations | PMC1234567 |
| ENA — European Nucleotide Archive | T3 | Raw reads | PRJEB12345 / ERX234567 |
| HCA Data Portal — Human Cell Atlas | T3 | Single-cell | <project-uuid> |
| Expression Atlas — EBI processed RNA-seq | T3 | Processed counts | E-MTAB-100 |
| Open Targets — gene-disease evidence | T3 | Genetic associations | EFO_0000249 |
Roadmap (P2): Reactome, UniProt, ArrayExpress / BioStudies, Figshare. Tracked in BACKLOG.md §1.
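For readers scripting against these sources, a hedged sketch of routing an accession string to its likely source follows. The patterns are inferred only from the example accessions in the table, so they are assumptions rather than the archives' official grammars. `guess_source` is a hypothetical helper, not part of the Meta Omix API.

```python
import re

# Patterns inferred from the example accessions in the table above.
# They are illustrative assumptions, not official accession grammars.
ACCESSION_PATTERNS = [
    ("GEO", re.compile(r"^GSE\d+$")),
    ("SRA", re.compile(r"^(SRP\d+|PRJNA\d+)$")),
    ("ENA", re.compile(r"^(PRJEB\d+|ERX\d+)$")),
    ("Expression Atlas", re.compile(r"^E-[A-Z]{4}-\d+$")),
    ("Open Targets", re.compile(r"^EFO_\d{7}$")),
    ("Europe PMC", re.compile(r"^PMC\d+$")),
    ("PubMed", re.compile(r"^PMID:\d+$")),
    ("iCite", re.compile(r"^iCite:\d+$")),
    ("Zenodo", re.compile(r"^zenodo:\d+$")),
    # HCA projects are identified by a UUID.
    ("HCA", re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")),
]


def guess_source(accession: str):
    """Return the first source whose pattern matches, or None if nothing does."""
    token = accession.strip()
    for source, pattern in ACCESSION_PATTERNS:
        if pattern.match(token):
            return source
    return None


for sample in ["GSE12345", "PRJEB12345", "E-MTAB-100", "EFO_0000249", "not-an-id"]:
    print(sample, "->", guess_source(sample))
```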
Meta Omix uses LLMs as decoration, not as the scoring engine. Providers are pluggable, the local Ollama path is the default, and no key ever leaves the user's machine unless the user explicitly configures the provider through the Settings UI.
- At rest: AES-GCM 256, master key derived from the OS keychain (`keyring`) by default, or via Argon2id from a master password.
- In transit: HTTPS to the upstream provider's API.
- In memory: kept only for the duration of a single request, never logged.
- Audit log: every secret event (`secret.created`, `secret.updated`, `secret.rotated`, `secret.deleted`, `secret.used`) recorded in `audit_logs`.
Supported providers, all opt-in, all replaceable:
| Provider | Tasks | Notes |
|---|---|---|
| Ollama (local) | summarise / embed | Default. No key required. |
| Anthropic | summarise / report | Claude Opus / Sonnet / Haiku |
| OpenAI | summarise / embed / report | GPT family |
| Azure OpenAI | summarise / embed / report | endpoint + deployment_name + api_version |
| AWS Bedrock | summarise / report | aws_access_key_id + secret + region |
| Groq | summarise | Llama / Mixtral fast inference |
| Mistral | summarise / embed | api.mistral.ai |
| Google Gemini | summarise / embed | Generative Language API |
| Cohere | summarise / embed / rerank | rerank used by hybrid search |
| OpenAI-compatible | summarise / embed | catch-all for self-hosted / Together / DeepInfra / Fireworks |
A provider fallback chain ensures graceful degradation: if a configured cloud provider fails, the registry falls back to the next available one, eventually to Ollama, and the response carries fallback_used = true plus a fallback_reason for full transparency. See apps/api/app/ai/providers/registry.py.
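The fallback behaviour can be sketched in a few lines. This is a simplified illustration, not the code in `registry.py`: the `Provenance` dataclass, `complete_with_fallback`, the two stub providers, and the placeholder model name are all invented for the example. Only the field names come from the provenance list in the Highlights.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provenance:
    provider: str
    model: str
    seed: Optional[int]
    prompt_hash: str
    response_hash: str
    fallback_used: bool
    fallback_reason: Optional[str]


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def complete_with_fallback(prompt, chain, seed=None):
    """Try each (provider, model, call) in order; return the first success plus provenance."""
    first_error = None
    for position, (provider, model, call) in enumerate(chain):
        try:
            response = call(prompt)
        except Exception as exc:  # a real registry would catch narrower error types
            if first_error is None:
                first_error = f"{provider}: {exc}"
            continue
        used_fallback = position > 0
        return response, Provenance(
            provider=provider,
            model=model,
            seed=seed,
            prompt_hash=sha256_hex(prompt),
            response_hash=sha256_hex(response),
            fallback_used=used_fallback,
            fallback_reason=first_error if used_fallback else None,
        )
    raise RuntimeError(f"all providers failed; first error: {first_error}")


def failing_cloud(prompt):  # stub standing in for an unreachable cloud provider
    raise TimeoutError("upstream timed out")


def local_ollama(prompt):  # stub standing in for the local default
    return f"summary of: {prompt}"


chain = [
    ("anthropic", "placeholder-cloud-model", failing_cloud),
    ("ollama", "qwen2.5:7b", local_ollama),
]
text, prov = complete_with_fallback("Why is GSE12345 scored highly?", chain, seed=7)
print(prov.provider, prov.fallback_used, prov.fallback_reason)
```

Recording the first failure, rather than the last, keeps the reason tied to the provider the user actually asked for.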
Three Snakemake pipelines are bundled and orchestrated through the /pipelines/[kind] UI route. Each is fully reproducible (--use-conda) and writes outputs to MinIO under pipelines/<kind>/<run-timestamp>/.
- `pipelines/snakemake/rnaseq/`: bulk RNA-seq. FastQC → fastp trim → STAR alignment → Salmon quantification → pydeseq2 differential expression → MultiQC aggregation.
- `pipelines/snakemake/scrna/`: single-cell tutorial pipeline. Downloads PBMC 3k from 10x Genomics, runs scanpy QC → normalise → PCA → neighbours → Leiden → UMAP → marker genes.
- `pipelines/snakemake/microbiome/`: diversity metrics. DADA2 (R) feature-table prep, scikit-bio Shannon / Chao1 / observed_otus / Bray-Curtis.
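For orientation, three of the diversity metrics named in the microbiome pipeline reduce to short formulas. The sketch below writes them in plain Python from their textbook definitions. It is not the scikit-bio implementation the pipeline uses, and the natural-log default for Shannon is a choice made for this example.

```python
import math


def observed_features(counts):
    """Number of features (e.g. ASVs) with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts, base=math.e):
    """Shannon entropy H = -sum(p_i * log(p_i)) over non-zero proportions."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


# Toy count vectors, invented for illustration.
sample_a = [10, 10, 0, 5]
sample_b = [0, 10, 10, 5]
print(observed_features(sample_a))
print(round(shannon(sample_a), 4))
print(round(bray_curtis(sample_a, sample_b), 4))
```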
Run history is persisted in the pipeline_runs Postgres table and surfaced in the UI through /pipelines/[kind]. The DAG itself is rendered live from the Snakefile by an in-house parser (apps/web/lib/snakefile-parser.ts) into a React Flow canvas with mini-map and per-rule popovers.
docker compose -f docker-compose.observability.yml up -d
# Grafana: http://localhost:3001 (admin / admin)
# Prometheus: http://localhost:9090
# Loki: http://localhost:3100
Three dashboards are provisioned out of the box:
| Dashboard | Source | What it shows |
|---|---|---|
| API health | OTEL traces + Prometheus | request rate, P95 latency, error rate, BYOK fallback rate |
| Ingestion | `metaomix_ingestion_*` counters | per-source records ingested, circuit-breaker state, retry counts |
| AI providers | `metaomix_ai_chat_*` + `metaomix_embedding_cache_*` | calls / 24 h, success rate, latency P95, cache hit rate |
Custom metrics are exported under the metaomix_ prefix; see apps/api/app/observability/metrics.py.
Every variable below is optional — defaults work end-to-end. Override only what you need by writing to a .env file in the repo root (Compose picks it up automatically).
| Variable | Default | Purpose |
|---|---|---|
| `ENV` | `development` | `development` / `staging` / `production` / `test` |
| `LOG_LEVEL` | `INFO` | `DEBUG` / `INFO` / `WARNING` / `ERROR` |
| `SECRET_KEY` | `change-me-…` | App secret. Override in production. |
| `API_HOST` / `API_PORT` | `0.0.0.0` / `8000` | API bind |
| `API_BASE_URL` | `http://localhost:8000` | Used by the web app and by webhooks |
| `API_CORS_ORIGINS` | `http://localhost:3000,…` | Comma-separated origins |
| `DATABASE_URL` | `postgresql+asyncpg://metaomix:metaomix-local@postgres:5432/metaomix` | Postgres + pgvector |
| `ALEMBIC_DATABASE_URL` | psycopg sync DSN | Used by `api-migrate` only |
| `REDIS_URL` | `redis://redis:6379/0` | Cache + Celery broker (DB 1, 2 for queues) |
| `STORAGE_BACKEND` | `minio` | `minio` / `s3` / `r2` / `b2` / `azure` / `gcs` / `local_fs` |
| `MINIO_ENDPOINT` / `MINIO_BUCKET` | `minio:9000` / `metaomix` | Object storage (default) |
| `MINIO_CONSOLE_PORT` | `9101` | Host port for the MinIO web console |
| `OLLAMA_BASE_URL` | `http://ollama:11434` | Local LLM endpoint |
| `OLLAMA_DEFAULT_MODEL` | `qwen2.5:7b` | Chat-capable model fallback |
| `OLLAMA_DEFAULT_EMBED_MODEL` | `nomic-embed-text` | Default embedding model |
| `OLLAMA_AUTO_PULL` | `true` | Toggle for the `ollama-init` bootstrap |
| `EMBEDDINGS_BACKEND` | `local` | `local` / `ollama` / `openai` / `voyage` / `cohere` / `mistral` |
| `EMBEDDINGS_MODEL` | `pritamdeka/S-PubMedBert-MS-MARCO` | Default sentence-transformers model |
| `ENABLE_AUTH` | `false` | Enable API-key middleware (multi-user is roadmap) |
| `ENABLE_TELEMETRY` | `false` | OpenTelemetry exporter switch |
| `SENTRY_DSN` | unset | Optional error tracking |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://otel-collector:4318/v1/traces` | If telemetry on |
| `NCBI_API_KEY` | unset | Boost NCBI rate limits 3 → 10 req / s |
| `NCBI_EMAIL` | unset | Required by NCBI Entrez |
| `RATE_LIMIT_PER_MINUTE` | `120` | Default rate limit |
| `RATE_LIMIT_INGEST_PER_MINUTE` | `5` | Ingestion endpoint rate limit |
| `FLOWER_URL` | `http://flower:5555` | Internal URL the API uses to probe Flower |
| `FLOWER_EXTERNAL_URL` | `http://localhost:5556` | Browser-facing URL embedded by `/operations/jobs` |
| `FLOWER_PORT` | `5556` | Host port for Flower |
Full template lives in .env.example.
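The "defaults work, override what you need" behaviour can be illustrated with a tiny loader. This is a sketch of the pattern only, using a handful of variables and defaults from the table. The `Settings` dataclass, `load_settings`, `env_bool`, and the set of accepted truthy strings are assumptions; the real backend may parse its configuration differently.

```python
import os
from dataclasses import dataclass


def env_bool(name, default, environ):
    """Parse a boolean flag; the accepted truthy strings are an assumption."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    env: str
    log_level: str
    redis_url: str
    ollama_auto_pull: bool
    enable_telemetry: bool
    rate_limit_per_minute: int


def load_settings(environ=os.environ):
    """Read each variable, falling back to the documented default when unset."""
    return Settings(
        env=environ.get("ENV", "development"),
        log_level=environ.get("LOG_LEVEL", "INFO"),
        redis_url=environ.get("REDIS_URL", "redis://redis:6379/0"),
        ollama_auto_pull=env_bool("OLLAMA_AUTO_PULL", True, environ),
        enable_telemetry=env_bool("ENABLE_TELEMETRY", False, environ),
        rate_limit_per_minute=int(environ.get("RATE_LIMIT_PER_MINUTE", "120")),
    )


defaults = load_settings({})  # nothing set: every default applies
overridden = load_settings({"OLLAMA_AUTO_PULL": "false", "LOG_LEVEL": "DEBUG"})
print(defaults)
print(overridden)
```

Passing the environment in as a mapping keeps the loader easy to test without touching the real process environment.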
| Tool | Version | Notes |
|---|---|---|
| Docker | 24+ | Tested with Docker Engine 24-28 + Compose v2 |
| Docker Compose | v2 | bundled with modern Docker Desktop / Linux installs |
| Node | 22+ | .nvmrc pins the LTS |
| pnpm | 9+ | corepack prepare pnpm@9.15.0 --activate |
| Python | 3.12 | pinned in .python-version and pyproject.toml |
| uv | 0.5+ | `curl -LsSf https://astral.sh/uv/install.sh \| sh` |
| mise (optional) | latest | resolves all of the above from .mise.toml |
.
├── apps/
│ ├── api/ # FastAPI backend
│ ├── web/ # Next.js 15 frontend
│ └── docs/ # Docusaurus documentation site
├── pipelines/
│ ├── notebooks/ # Jupyter starters per data type
│ └── snakemake/ # rnaseq, scrna, microbiome
├── infra/
│ ├── docker/ # Dockerfiles
│ ├── observability/ # Grafana + Loki + Promtail provisioning
│ └── postgres-init/ # Schema bootstrap (extensions)
├── docs/
│ └── adr/ # Architecture Decision Records
├── packages/ # shared-types + ui (workspace stubs for future expansion)
└── data/ # Provisional ground truth + calibration report
# Bring up infra + auto-reloading API + Next dev server (still one command)
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
pnpm dev
# Backend tests with coverage
( cd apps/api && uv run pytest --cov=app --cov-fail-under=80 )
# Frontend tests with watch mode
pnpm --filter @metaomix/web test:watch
# Storybook
pnpm --filter @metaomix/web storybook
# Codegen drift check (after changing API routes)
pnpm --filter @metaomix/web codegen:api
# Git hooks
pnpm exec lefthook install
pre-commit install
The metaomix-cli lives inside the api container. The platform is fully usable through the UI without invoking it by hand, but you can still drive it directly:
docker compose exec api uv run metaomix-cli ingest geo --term "alzheimer" --max-results 25
docker compose exec api uv run metaomix-cli score-all
docker compose exec api uv run metaomix-cli embed-papers
docker compose exec api uv run metaomix-cli backup
The same surface is exposed through the UI at /operations/ingestion, /operations/jobs, and the /pipelines/[kind] pages.
| Layer | Tool | Coverage / threshold |
|---|---|---|
| Backend unit + integration | pytest + hypothesis + respx | --cov-fail-under=80 |
| Frontend unit | vitest + @testing-library/react | ≥ 70 % statements / ≥ 50 % branches |
| Frontend component a11y | Storybook + axe-core | 0 critical violations |
| End-to-end | Playwright | 12 flows; runs against docker compose up in CI |
| Visual regression | Playwright snapshots | maxDiffPixelRatio: 0.02 over 22 routes |
| Page-level a11y / perf | Lighthouse CI | a11y ≥ 95, perf ≥ 95 over 25 paths |
| Property-based | Hypothesis | invariant 0 ≤ score ≤ 100, idempotency of normalisation |
| OpenAPI drift | openapi-typescript + `codegen-check` | CI fails if generated types diverge |
| Bundle budget | custom node script | entry chunks < 130 KB gzipped; Plotly / React Flow / React-PDF / Monaco out of entry |
Official Docker images are built and signed by CI:
ghcr.io/edgarzzin/meta-omix-api:0.2.0 # multi-arch: amd64 + arm64
ghcr.io/edgarzzin/meta-omix-web:0.2.0 # multi-arch: amd64 + arm64
The reference deployment is the docker-compose.yml shipped in this repo; the same automation that runs locally (api-migrate, minio-init, ollama-init) applies on a server. A Helm chart for Kubernetes is on the roadmap (BACKLOG.md BE-INFRA-041).
Meta Omix takes supply-chain hardening seriously. Every release ships:
- Multi-arch images (`amd64`, `arm64`) built reproducibly from `infra/docker/`.
- SBOM in SPDX-JSON format generated by `syft`, signed with `cosign`.
- Build attestations (SLSA provenance) emitted by GitHub Actions.
- Dependabot updates for pip / npm / docker / github-actions weekly.
- CodeQL for Python and JavaScript with `security-and-quality` query pack.
- gitleaks on every push and PR.
- Trivy filesystem + image scans for known CVEs.
- OSV-Scanner for advisory data crossed against `pnpm-lock.yaml` and `apps/api/uv.lock`.
- dependency-review action blocking GPL-incompatible licenses on PR.
- DCO check enforcing `Signed-off-by` on every commit.
CI matrix and per-job permissions live in .github/workflows/. Branch protection on main is documented in .github/BRANCH_PROTECTION.md.
The full backlog is the source of truth: BACKLOG.md.
- 0 P0 open (all v0.1 blockers shipped).
- 0 P1 open (Session 4 closed all 35; see the audit at the top of `BACKLOG.md`).
- 65 P2 open (multi-user auth, Reactome / UniProt / Figshare ingestion, Helm chart, Tauri desktop bundle, additional refinements).
- 33 P3 open (longer-term roadmap items).
Architecture decisions are tracked in docs/adr/. Currently:
| ADR | Topic |
|---|---|
| 0001 | Scoring-weight calibration (L2 grid search, ground truth, r=0.92) |
| 0002 | Metadata-quality v2 (10 weighted fields, penalty curve) |
| 0003 | Dedicated /sources/status endpoint and tier classification |
| 0004 | DOC-011 demo-video recategorisation P1 → P2 |
PRs welcome. Read CONTRIBUTING.md before opening one — DCO sign-off and Conventional Commits are mandatory because release-please consumes them.
By participating you agree to abide by the Code of Conduct.
Meta Omix is a single-maintainer project. Authority and decision-making are documented in GOVERNANCE.md and MAINTAINERS.md. Branch protection, code-review gates, and CODEOWNERS enforce single-maintainer review on the GitHub side.
Apache License 2.0. See LICENSE for the full text and NOTICE for upstream attributions. Per-dependency licenses are catalogued in THIRD-PARTY-NOTICES.md.
If Meta Omix helps you find a dataset that lands in a paper, a citation is appreciated:
@software{metaomix_2026,
author = {edgarzzin},
title = {Meta Omix: A scientific instrument for biomedical dataset discovery},
year = {2026},
url = {https://github.com/edgarzzin/meta-omix},
version = {0.2.0}
}
A Zenodo DOI will be minted from the v1.0 tag; until then, please cite the GitHub URL.
Meta Omix stands on the shoulders of dozens of open-source projects. A non-exhaustive thank-you list to the upstream maintainers whose work makes this possible:
- Backend: FastAPI · Pydantic · SQLAlchemy · Alembic · Celery · pgvector · DuckDB · Biopython · GEOparse · scanpy · pydeseq2 · scispacy · cryptography · argon2-cffi · keyring · Hypothesis · Ruff · uv · Snakemake.
- Frontend: React · Next.js · TanStack Query / Table / Virtual · Radix UI · Framer Motion · cmdk · Tailwind · Plotly.js · React Flow · Monaco Editor · Zod · react-hook-form · nuqs · Zustand · lucide-react · Storybook · Vitest · Playwright · Lighthouse.
- Infrastructure: PostgreSQL · Redis · MinIO · Ollama · Grafana · Loki · Promtail · OpenTelemetry · Snakemake.
Bug reports, well-formed PRs, and rigorous critiques of the scoring methodology are all considered contributions and credited in CHANGELOG.md upon release.