Meta Omix

Discover underexplored biomedical datasets through transparent, deterministic scoring.

A scientific instrument for finding GEO, SRA, Zenodo, ENA, HCA, Expression Atlas, and Open Targets datasets that deserve a second look — local-first, BYOK, fully auditable.

Documentation · Changelog · Roadmap · ADRs · Security · Contributing

Demo video: <!-- VIDEO_URL -->  ·  recording recategorised P1 → P2 in ADR-0004


Why Meta Omix

GEO and SRA together index over 1.4 million biomedical datasets. The vast majority are cited fewer than five times. Researchers re-analyse the same handful of famous accessions because the ones with quiet potential are buried by a flat search interface, by sparse metadata, and by the cost of triaging hundreds of candidates by hand.

Meta Omix scores every dataset it ingests on six deterministic components — reuse potential, citation underexploration, disease relevance, metadata completeness, data-type leverage, recency — and surfaces the under-cited datasets where the science is strong, the metadata is rich, and a re-analysis would actually move the needle.

The scoring formula is calibrated against a curated ground truth (Pearson r = 0.92, Spearman r = 0.90), versioned in ADR-0001, and fully reproducible offline.

The score is the heart. The LLM is decoration.

Cloud LLMs are an opt-in narrative layer that explains why a score is what it is — they never decide what the score is.

Highlights

  • 🧬 Deterministic 6-component scoring — calibrated via L2-regularised grid search against 51 ground-truth datasets. Pearson r = 0.92, Spearman r = 0.90. ADR-versioned.
  • 🌐 10 ingestion sources wired with circuit breakers, retries, and per-source rate limits — GEO, PubMed, iCite, SRA, Zenodo, Europe PMC, ENA, HCA, Expression Atlas, Open Targets.
  • 🔐 BYOK with AES-GCM 256 + Argon2id master-password derivation; keys never leave your machine. Nine cloud providers + Ollama local. Provenance recorded for every model call.
  • 🧪 Three Snakemake pipelines out of the box: bulk RNA-seq (STAR + Salmon + DESeq2), single-cell scRNA (scanpy), microbiome diversity (DADA2 + scikit-bio).
  • 📐 22 UI routes, WCAG 2.2 AA, Lighthouse a11y ≥ 95 enforced in CI; design tokens cover dark + light + density-compact.
  • 🔭 Observability built-in — OpenTelemetry traces, Prometheus metrics (metaomix_* counters / gauges / histograms), Loki + Promtail + Grafana dashboards.
  • 🛡️ Supply-chain hardened — multi-arch Docker images, SBOM signed with cosign, attestations, dependabot, CodeQL, gitleaks, Trivy, OSV-Scanner, dependency-review.
  • 🧾 Provenance everywhere — every LLM-generated artefact carries provider, model, seed, prompt_hash, response_hash, fallback_used, fallback_reason.
  • 🌱 Local-first by default: docker compose up, no cloud account required. The platform is fully usable without a single API key.
  • ⚙️ Zero-touch bootstrap — Postgres extensions, Alembic migrations, MinIO bucket, Ollama models, all set up automatically on the first up. No manual follow-up commands.
  • 📜 Apache 2.0 — explicit patent grant, institution-friendly, single-maintainer governance.

Quickstart — one command

git clone https://github.com/edgarzzin/meta-omix.git
cd meta-omix
docker compose up -d

That's it. Sixty seconds later open http://localhost:3000.

The defaults in docker-compose.yml are designed so the first up produces a fully functional stack:

  • Postgres extensions installed (infra/postgres-init/00-extensions.sql).
  • Alembic migrated to head automatically by the api-migrate init container.
  • MinIO bucket created and made publicly readable for object downloads (minio-init).
  • Ollama bootstrapped with the default chat model (qwen2.5:7b) and the default embedding model (nomic-embed-text) (ollama-init).
  • Nine long-running services healthy: postgres, redis, minio, ollama, api, worker, beat, flower, web.
  • Three one-shot init services exit cleanly: api-migrate, minio-init, ollama-init.

If you want to override anything, copy the env template:

cp .env.example .env
$EDITOR .env             # toggle ENABLE_TELEMETRY, point OLLAMA at a host install, …
docker compose up -d     # picks up .env automatically

To skip the Ollama auto-pull (e.g. on a bandwidth-constrained machine):

OLLAMA_AUTO_PULL=false docker compose up -d

To stop everything:

docker compose down       # stop containers, keep volumes
docker compose down -v    # stop AND wipe volumes (postgres, minio, ollama models)

What docker compose up actually does

The bring-up is choreographed through depends_on health and completion gates so the human never has to follow up with manual commands. Top to bottom:

1. postgres ──► healthy (pg_isready)
2. redis ─────► healthy (redis-cli ping)
3. minio ─────► healthy (mc ready local)
4. ollama ────► healthy (ollama list)

5. api-migrate ──► (depends on postgres) ──► alembic upgrade head ──► exit 0
6. minio-init ───► (depends on minio)    ──► create bucket + policy ──► exit 0
7. ollama-init ──► (depends on ollama)   ──► pull default models    ──► exit 0

8. api ──► (depends on postgres + redis + minio + api-migrate + minio-init)
        ──► uvicorn → listening on :8000

9. worker ──► (depends on api + api-migrate)
            ──► celery -A app.workers.celery_app worker

10. beat ────► (depends on worker)
             ──► celery beat with RedBeat scheduler

11. flower ──► (depends on redis)
            ──► port 5556 (browser) / 5555 (internal)

12. web ──────► (depends on api healthy)
              ──► next start → listening on :3000

docker compose up -d returns when every long-running service is healthy. The init containers do not block subsequent up invocations because Alembic, mc mb, and ollama pull are all idempotent.
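The ordering above can be sanity-checked by encoding the listed dependencies as a graph and sorting it topologically. This is only a sketch: the DEPENDS_ON mapping is transcribed from the list above, not read from docker-compose.yml, and startup_order is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# service -> services it waits for, transcribed from the bring-up list above
DEPENDS_ON = {
    "postgres": [],
    "redis": [],
    "minio": [],
    "ollama": [],
    "api-migrate": ["postgres"],
    "minio-init": ["minio"],
    "ollama-init": ["ollama"],
    "api": ["postgres", "redis", "minio", "api-migrate", "minio-init"],
    "worker": ["api", "api-migrate"],
    "beat": ["worker"],
    "flower": ["redis"],
    "web": ["api"],
}


def startup_order(depends_on):
    """Return one valid start order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for position, service in enumerate(startup_order(DEPENDS_ON), start=1):
        print(position, service)
```

Services with no mutual dependency (for example redis and minio) may appear in either order; only the relative order along a dependency chain is guaranteed.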

Architecture

graph LR
  classDef source fill:#eef4ff,stroke:#3a6fd6,color:#0e1f44;
  classDef api fill:#f5f6f8,stroke:#404750,color:#111316;
  classDef store fill:#e1dbf6,stroke:#5a48bd,color:#150f37;
  classDef ai fill:#f3f0fb,stroke:#5a48bd,color:#322777;
  classDef ui fill:#eef4ff,stroke:#3a6fd6,color:#0e1f44;
  classDef obs fill:#dcecfa,stroke:#1d7387,color:#0a3a48;

  subgraph Sources [Public scientific archives]
    GEO[GEO]:::source
    PubMed[PubMed]:::source
    iCite[iCite]:::source
    SRA[SRA]:::source
    Zenodo[Zenodo]:::source
    EPMC[Europe PMC]:::source
    ENA[ENA]:::source
    HCA[HCA Data Portal]:::source
    GXA[Expression Atlas]:::source
    OT[Open Targets]:::source
  end

  subgraph Backend [Meta Omix backend - FastAPI]
    Ingest[Ingestion service]:::api
    Score[Scoring engine]:::api
    NLP[NER + ontology mapping]:::api
    AI[Provider registry]:::api
  end

  subgraph Data [Persistence]
    PG[(PostgreSQL + pgvector)]:::store
    Redis[(Redis)]:::store
    MinIO[(MinIO / S3)]:::store
    DuckDB[(DuckDB)]:::store
  end

  subgraph Local [Local LLM]
    Ollama((Ollama)):::ai
  end

  subgraph Cloud [Optional BYOK cloud providers]
    CloudAI[Anthropic / OpenAI / Bedrock<br/>Azure / Groq / Mistral<br/>Gemini / Cohere / Voyage]:::ai
  end

  subgraph UI [Frontend - Next.js 15]
    Web[22 routes<br/>Server + Client Components]:::ui
  end

  subgraph Obs [Observability]
    OTEL[OpenTelemetry Collector]:::obs
    Prom[Prometheus]:::obs
    Loki[Loki]:::obs
    Graf[Grafana]:::obs
  end

  Sources -->|HTTP / Entrez / GraphQL| Ingest
  Ingest --> PG
  Score --> PG
  NLP --> PG
  AI --> Ollama
  AI -.opt-in.-> CloudAI
  Backend --> Redis
  Backend --> MinIO
  Backend --> DuckDB
  Web -->|/api/v1/*| Backend
  Backend --> OTEL
  OTEL --> Prom
  OTEL --> Loki
  Prom --> Graf
  Loki --> Graf

Detailed architecture documentation lives in apps/docs/docs/architecture.md.

Service map

Twelve services run from docker-compose.yml. The three init services exit with code 0; the rest stay up.

| Service | Image | Purpose | Port (host bind) | Lifecycle |
|---|---|---|---|---|
| postgres | pgvector/pgvector:pg16 | Primary store + vector search (HNSW index) | internal only | long-running |
| redis | redis:7-alpine | Cache + Celery broker (DB 0/1/2) + RedBeat scheduler | internal only | long-running |
| minio | minio/minio | S3-compatible object store for reports, sample-matrix cache, notebooks, pipeline outputs | console 127.0.0.1:9101 | long-running |
| ollama | ollama/ollama:0.5.4 | Local LLM runtime (default Meta Omix AI provider) | 127.0.0.1:11434 | long-running |
| api | metaomix/api:dev | FastAPI backend, 55 OpenAPI paths, SSE for ingestion events | 127.0.0.1:8000 | long-running |
| worker | metaomix/api:dev | Celery worker (queues: ingestion, ai-cpu, ai-cloud, reports, maintenance) | internal only | long-running |
| beat | metaomix/api:dev | RedBeat scheduler: nightly refresh, weekly rescore, nightly Postgres backup | internal only | long-running |
| flower | mher/flower:2.0 | Celery dashboard, embedded as a sandboxed iframe in /operations/jobs | 127.0.0.1:5556 | long-running |
| web | metaomix/web:dev | Next.js 15 frontend, 22 routes | 127.0.0.1:3000 | long-running |
| api-migrate | metaomix/api:dev | One-shot: alembic upgrade head | | exits 0 |
| minio-init | minio/mc | One-shot: mc mb + mc anonymous set download | | exits 0 |
| ollama-init | ollama/ollama | One-shot: ollama pull for chat + embed default models | | exits 0 |

Optional observability stack lives in docker-compose.observability.yml (grafana, loki, promtail, prometheus, otel-collector) and is brought up with docker compose -f docker-compose.observability.yml up -d.

Scoring methodology

Each dataset receives a single opportunity score in [0, 100], the weighted sum of six components:

| Component | Weight | What it captures |
|---|---|---|
| Reuse potential | 0.22 | Are samples deep enough, raw enough, and licensed permissively enough that re-analysis is feasible? |
| Citation underexploration | 0.20 | Is this dataset cited fewer times than its sample size + recency would predict? |
| Disease relevance | 0.16 | Is the disease a high-burden, high-research-priority area (NIH categorical spending + DALYs)? |
| Metadata completeness | 0.16 | How rich is the metadata (10 weighted fields, scored via ADR-0002)? |
| Data-type leverage | 0.14 | RNA-seq / scRNA-seq / microbiome / multi-omics weighted by analysis-tooling maturity. |
| Recency | 0.12 | Decay function favouring datasets recent enough to use modern reference genomes / catalogues. |
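As a minimal sketch of the weighted sum, assuming each component has already been normalised to [0, 100] (the README does not state the per-component range, and the dictionary keys and the opportunity_score name are illustrative, not taken from the codebase):

```python
# Weights copied from the table above; they sum to 1.00.
WEIGHTS = {
    "reuse_potential": 0.22,
    "citation_underexploration": 0.20,
    "disease_relevance": 0.16,
    "metadata_completeness": 0.16,
    "data_type_leverage": 0.14,
    "recency": 0.12,
}


def opportunity_score(components):
    """Weighted sum of six component scores, each assumed to lie in [0, 100]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= components[name] <= 100.0:
            raise ValueError(f"{name} must be in [0, 100], got {components[name]}")
    return sum(weight * components[name] for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    example = {
        "reuse_potential": 80,
        "citation_underexploration": 90,
        "disease_relevance": 50,
        "metadata_completeness": 70,
        "data_type_leverage": 60,
        "recency": 40,
    }
    print(round(opportunity_score(example), 2))  # 68.0
```

Because the weights sum to one and every input is bounded, the result stays inside [0, 100] without any clamping.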

Calibration. Weights were fit by L2-regularised grid search against a 51-dataset ground truth curated by domain experts; the L2 penalty keeps the fitted weights close to a documented prior in apps/api/app/services/scoring/weights.py. The fit reaches Pearson r = 0.92 and Spearman r = 0.90 against the held-out fold. Methodology and results live in ADR-0001. The full per-component breakdown is documented in docs/scoring.md.

Reproducibility. The calibration data lives at data/{ground_truth.csv, dataset_metadata.csv, calibration_report.json}. The calibration script is apps/api/app/scripts/calibrate_weights.py. Re-running it from a clean checkout reproduces the published weights.
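To make the idea of an L2-regularised grid search concrete, here is a toy sketch. It is not calibrate_weights.py: the squared-error objective, the grid resolution, the three-component setup, and every number in the demo are assumptions chosen for illustration only.

```python
from itertools import product


def simplex_grid(n, steps):
    """Yield every n-vector of non-negative multiples of 1/steps that sums to 1."""
    for combo in product(range(steps + 1), repeat=n - 1):
        rest = steps - sum(combo)
        if rest >= 0:
            yield tuple(c / steps for c in combo) + (rest / steps,)


def loss(weights, rows, targets, prior, lam):
    """Mean squared error plus an L2 penalty pulling weights toward the prior."""
    mse = sum(
        (sum(w * x for w, x in zip(weights, row)) - target) ** 2
        for row, target in zip(rows, targets)
    ) / len(rows)
    penalty = lam * sum((w - p) ** 2 for w, p in zip(weights, prior))
    return mse + penalty


def calibrate(rows, targets, prior, lam, steps=10):
    """Exhaustively pick the grid point with the lowest penalised loss."""
    return min(
        simplex_grid(len(prior), steps),
        key=lambda w: loss(w, rows, targets, prior, lam),
    )


# Made-up toy data: three components, targets generated from weights (0.5, 0.3, 0.2).
TOY_ROWS = [(100, 0, 0), (0, 100, 0), (0, 0, 100), (50, 50, 50)]
TOY_TARGETS = [50, 30, 20, 50]
TOY_PRIOR = (0.4, 0.3, 0.3)

if __name__ == "__main__":
    print(calibrate(TOY_ROWS, TOY_TARGETS, TOY_PRIOR, lam=0.0))
    print(calibrate(TOY_ROWS, TOY_TARGETS, TOY_PRIOR, lam=1e9))
```

With no penalty the search recovers the weights that generated the toy targets; with a very large penalty it stays at the prior. A useful setting lies between those extremes and would normally be chosen on held-out data.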

Data sources

| Source | Tier | Type | Example accession |
|---|---|---|---|
| GEO: NCBI Gene Expression Omnibus | T1 | Bulk + scRNA + microarray | GSE12345 |
| PubMed: bibliographic enrichment | T1 | Citations | PMID:34567890 |
| iCite: NIH citation metrics | T1 | Citations | iCite:34567890 |
| SRA: Sequence Read Archive | T2 | Raw reads | SRP123456 / PRJNA234567 |
| Zenodo: open scientific repository | T2 | Reanalysis bundles | zenodo:7654321 |
| Europe PMC: fallback for PubMed | T2 | Citations | PMC1234567 |
| ENA: European Nucleotide Archive | T3 | Raw reads | PRJEB12345 / ERX234567 |
| HCA Data Portal: Human Cell Atlas | T3 | Single-cell | <project-uuid> |
| Expression Atlas: EBI processed RNA-seq | T3 | Processed counts | E-MTAB-100 |
| Open Targets: gene-disease evidence | T3 | Genetic associations | EFO_0000249 |

Roadmap (P2): Reactome, UniProt, ArrayExpress / BioStudies, Figshare. Tracked in BACKLOG.md §1.
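The example accessions in the table suggest a simple prefix-based routing step. The sketch below is an assumption about how such routing could look, not the project's ingestion code: the patterns cover only the accession shapes shown in the table, and classify_accession is a hypothetical helper.

```python
import re

# Patterns derived only from the example accessions in the table above;
# real archives accept more prefixes than are listed here.
_PATTERNS = [
    ("GEO", re.compile(r"GSE\d+")),
    ("PubMed", re.compile(r"PMID:\d+")),
    ("iCite", re.compile(r"iCite:\d+")),
    ("SRA", re.compile(r"(?:SRP|PRJNA)\d+")),
    ("Zenodo", re.compile(r"zenodo:\d+")),
    ("Europe PMC", re.compile(r"PMC\d+")),
    ("ENA", re.compile(r"(?:PRJEB|ERX)\d+")),
    ("HCA", re.compile(
        r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
        re.IGNORECASE,
    )),
    ("Expression Atlas", re.compile(r"E-MTAB-\d+")),
    ("Open Targets", re.compile(r"EFO_\d+")),
]


def classify_accession(accession):
    """Return the source name for an accession, or None if no pattern matches."""
    candidate = accession.strip()
    for source, pattern in _PATTERNS:
        if pattern.fullmatch(candidate):
            return source
    return None


if __name__ == "__main__":
    for acc in ["GSE12345", "PRJEB12345", "E-MTAB-100", "not-an-accession"]:
        print(acc, "->", classify_accession(acc))
```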

BYOK and AI providers

Meta Omix uses LLMs as decoration, not as the scoring engine. Providers are pluggable, the local Ollama path is the default, and no key ever leaves the user's machine unless the user explicitly configures the provider through the Settings UI.

  • At rest: AES-GCM 256, master key derived from the OS keychain (keyring) by default, or via Argon2id from a master password.
  • In transit: HTTPS to the upstream provider's API.
  • In memory: kept only for the duration of a single request, never logged.
  • Audit log: every secret event (secret.created, secret.updated, secret.rotated, secret.deleted, secret.used) recorded in audit_logs.

Supported providers, all opt-in, all replaceable:

| Provider | Tasks | Notes |
|---|---|---|
| Ollama (local) | summarise / embed | Default. No key required. |
| Anthropic | summarise / report | Claude Opus / Sonnet / Haiku |
| OpenAI | summarise / embed / report | GPT family |
| Azure OpenAI | summarise / embed / report | endpoint + deployment_name + api_version |
| AWS Bedrock | summarise / report | aws_access_key_id + secret + region |
| Groq | summarise | Llama / Mixtral fast inference |
| Mistral | summarise / embed | api.mistral.ai |
| Google Gemini | summarise / embed | Generative Language API |
| Cohere | summarise / embed / rerank | rerank used by hybrid search |
| OpenAI-compatible | summarise / embed | catch-all for self-hosted / Together / DeepInfra / Fireworks |

A provider fallback chain ensures graceful degradation: if a configured cloud provider fails, the registry falls back to the next available one, eventually to Ollama, and the response carries fallback_used = true plus a fallback_reason for full transparency. See apps/api/app/ai/providers/registry.py.
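A rough sketch of such a fallback chain follows. It does not reproduce registry.py: the ProviderResult dataclass, the complete_with_fallback function, and the use of SHA-256 for the hashes are assumptions; only the field names fallback_used, fallback_reason, prompt_hash, and response_hash come from this README.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ProviderResult:
    provider: str
    text: str
    prompt_hash: str
    response_hash: str
    fallback_used: bool
    fallback_reason: Optional[str]


def _sha256_hex(text: str) -> str:
    # SHA-256 is an assumption; the README does not name the hash function.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def complete_with_fallback(
    prompt: str, chain: List[Tuple[str, Callable[[str], str]]]
) -> ProviderResult:
    """Try providers in order; record why earlier ones were skipped."""
    errors: List[str] = []
    for name, call in chain:
        try:
            text = call(prompt)
        except Exception as exc:  # any provider failure triggers the next one
            errors.append(f"{name}: {exc}")
            continue
        return ProviderResult(
            provider=name,
            text=text,
            prompt_hash=_sha256_hex(prompt),
            response_hash=_sha256_hex(text),
            fallback_used=bool(errors),
            fallback_reason="; ".join(errors) or None,
        )
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Placing the local provider last in the chain gives the behaviour described above: a cloud failure degrades to the local model, and the caller can see that it happened and why.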

Pipelines

Three Snakemake pipelines are bundled and orchestrated through the /pipelines/[kind] UI route. Each is fully reproducible (--use-conda) and writes outputs to MinIO under pipelines/<kind>/<run-timestamp>/.

  • pipelines/snakemake/rnaseq/ — bulk RNA-seq. FastQC → fastp trim → STAR alignment → Salmon quantification → pydeseq2 differential expression → MultiQC aggregation.
  • pipelines/snakemake/scrna/ — single-cell tutorial pipeline. Downloads PBMC 3k from 10x Genomics, runs scanpy QC → normalise → PCA → neighbours → Leiden → UMAP → marker genes.
  • pipelines/snakemake/microbiome/ — diversity metrics. DADA2 (R) feature-table prep, scikit-bio Shannon / Chao1 / observed_otus / Bray-Curtis.

Run history is persisted in the pipeline_runs Postgres table and surfaced in the UI through /pipelines/[kind]. The DAG itself is rendered live from the Snakefile by an in-house parser (apps/web/lib/snakefile-parser.ts) into a React Flow canvas with mini-map and per-rule popovers.

Observability

docker compose -f docker-compose.observability.yml up -d
# Grafana: http://localhost:3001  (admin / admin)
# Prometheus: http://localhost:9090
# Loki: http://localhost:3100

Three dashboards are provisioned out of the box:

| Dashboard | Source | What it shows |
|---|---|---|
| API health | OTEL traces + Prometheus | request rate, P95 latency, error rate, BYOK fallback rate |
| Ingestion | metaomix_ingestion_* counters | per-source records ingested, circuit-breaker state, retry counts |
| AI providers | metaomix_ai_chat_* + metaomix_embedding_cache_* | calls / 24 h, success rate, latency P95, cache hit rate |

Custom metrics are exported under the metaomix_ prefix; see apps/api/app/observability/metrics.py.

Configuration

Every variable below is optional — defaults work end-to-end. Override only what you need by writing to a .env file in the repo root (Compose picks it up automatically).

| Variable | Default | Purpose |
|---|---|---|
| ENV | development | development / staging / production / test |
| LOG_LEVEL | INFO | DEBUG / INFO / WARNING / ERROR |
| SECRET_KEY | change-me-… | App secret. Override in production. |
| API_HOST / API_PORT | 0.0.0.0 / 8000 | API bind |
| API_BASE_URL | http://localhost:8000 | Used by the web app and by webhooks |
| API_CORS_ORIGINS | http://localhost:3000,… | Comma-separated origins |
| DATABASE_URL | postgresql+asyncpg://metaomix:metaomix-local@postgres:5432/metaomix | Postgres + pgvector |
| ALEMBIC_DATABASE_URL | psycopg sync DSN | Used by api-migrate only |
| REDIS_URL | redis://redis:6379/0 | Cache + Celery broker (DB 1, 2 for queues) |
| STORAGE_BACKEND | minio | minio / s3 / r2 / b2 / azure / gcs / local_fs |
| MINIO_ENDPOINT / MINIO_BUCKET | minio:9000 / metaomix | Object storage (default) |
| MINIO_CONSOLE_PORT | 9101 | Host port for the MinIO web console |
| OLLAMA_BASE_URL | http://ollama:11434 | Local LLM endpoint |
| OLLAMA_DEFAULT_MODEL | qwen2.5:7b | Chat-capable model fallback |
| OLLAMA_DEFAULT_EMBED_MODEL | nomic-embed-text | Default embedding model |
| OLLAMA_AUTO_PULL | true | Toggle for the ollama-init bootstrap |
| EMBEDDINGS_BACKEND | local | local / ollama / openai / voyage / cohere / mistral |
| EMBEDDINGS_MODEL | pritamdeka/S-PubMedBert-MS-MARCO | Default sentence-transformers model |
| ENABLE_AUTH | false | Enable API-key middleware (multi-user is roadmap) |
| ENABLE_TELEMETRY | false | OpenTelemetry exporter switch |
| SENTRY_DSN | unset | Optional error tracking |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://otel-collector:4318/v1/traces | If telemetry on |
| NCBI_API_KEY | unset | Boost NCBI rate limits 3 → 10 req / s |
| NCBI_EMAIL | unset | Required by NCBI Entrez |
| RATE_LIMIT_PER_MINUTE | 120 | Default rate limit |
| RATE_LIMIT_INGEST_PER_MINUTE | 5 | Ingestion endpoint rate limit |
| FLOWER_URL | http://flower:5555 | Internal URL the API uses to probe Flower |
| FLOWER_EXTERNAL_URL | http://localhost:5556 | Browser-facing URL embedded by /operations/jobs |
| FLOWER_PORT | 5556 | Host port for Flower |

Full template lives in .env.example.
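For readers wiring these variables into their own tooling, a defaults-first loader might look like the sketch below. It covers only a handful of the variables, and everything apart from the variable names and default values is assumed: the Settings class, the load_settings function, and the accepted boolean spellings are illustrative and not taken from the API code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}  # accepted spellings are an assumption


@dataclass(frozen=True)
class Settings:
    env: str
    log_level: str
    enable_auth: bool
    enable_telemetry: bool
    rate_limit_per_minute: int
    rate_limit_ingest_per_minute: int
    ncbi_api_key: Optional[str]

    @property
    def ncbi_requests_per_second(self) -> int:
        # Per the table: 3 req/s without an NCBI key, 10 req/s with one.
        return 10 if self.ncbi_api_key else 3


def _as_bool(value: str) -> bool:
    return value.strip().lower() in _TRUE


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read a subset of the variables above, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        env=env.get("ENV", "development"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        enable_auth=_as_bool(env.get("ENABLE_AUTH", "false")),
        enable_telemetry=_as_bool(env.get("ENABLE_TELEMETRY", "false")),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "120")),
        rate_limit_ingest_per_minute=int(env.get("RATE_LIMIT_INGEST_PER_MINUTE", "5")),
        ncbi_api_key=env.get("NCBI_API_KEY") or None,
    )
```

Passing a plain dictionary instead of os.environ keeps the loader easy to test, for example load_settings({"NCBI_API_KEY": "abc"}).ncbi_requests_per_second.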

Development

Prerequisites

| Tool | Version | Notes |
|---|---|---|
| Docker | 24+ | Tested with Docker Engine 24-28 + Compose v2 |
| Docker Compose | v2 | bundled with modern Docker Desktop / Linux installs |
| Node | 22+ | .nvmrc pins the LTS |
| pnpm | 9+ | corepack prepare pnpm@9.15.0 --activate |
| Python | 3.12 | pinned in .python-version and pyproject.toml |
| uv | 0.5+ | curl -LsSf https://astral.sh/uv/install.sh \| sh |
| mise (optional) | latest | resolves all of the above from .mise.toml |

Monorepo layout

.
├── apps/
│   ├── api/        # FastAPI backend
│   ├── web/        # Next.js 15 frontend
│   └── docs/       # Docusaurus documentation site
├── pipelines/
│   ├── notebooks/  # Jupyter starters per data type
│   └── snakemake/  # rnaseq, scrna, microbiome
├── infra/
│   ├── docker/     # Dockerfiles
│   ├── observability/   # Grafana + Loki + Promtail provisioning
│   └── postgres-init/   # Schema bootstrap (extensions)
├── docs/
│   └── adr/        # Architecture Decision Records
├── packages/       # shared-types + ui (workspace stubs for future expansion)
└── data/           # Provisional ground truth + calibration report

Daily loop

# Bring up infra + auto-reloading API + Next dev server (still one command)
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
pnpm dev

# Backend tests with coverage
( cd apps/api && uv run pytest --cov=app --cov-fail-under=80 )

# Frontend tests with watch mode
pnpm --filter @metaomix/web test:watch

# Storybook
pnpm --filter @metaomix/web storybook

# Codegen drift check (after changing API routes)
pnpm --filter @metaomix/web codegen:api

Pre-commit

pnpm exec lefthook install
pre-commit install

Driving the CLI directly

The metaomix-cli lives inside the api container. The platform is fully usable through the UI without invoking it by hand, but you can still drive it directly:

docker compose exec api uv run metaomix-cli ingest geo --term "alzheimer" --max-results 25
docker compose exec api uv run metaomix-cli score-all
docker compose exec api uv run metaomix-cli embed-papers
docker compose exec api uv run metaomix-cli backup

The same surface is exposed through the UI at /operations/ingestion, /operations/jobs, and the /pipelines/[kind] pages.

Testing

| Layer | Tool | Coverage / threshold |
|---|---|---|
| Backend unit + integration | pytest + hypothesis + respx | --cov-fail-under=80 |
| Frontend unit | vitest + @testing-library/react | ≥ 70 % statements / ≥ 50 % branches |
| Frontend component a11y | Storybook + axe-core | 0 critical violations |
| End-to-end | Playwright | 12 flows; runs against docker compose up in CI |
| Visual regression | Playwright snapshots | maxDiffPixelRatio: 0.02 over 22 routes |
| Page-level a11y / perf | Lighthouse CI | a11y ≥ 95, perf ≥ 95 over 25 paths |
| Property-based | Hypothesis | invariant 0 ≤ score ≤ 100, idempotency of normalisation |
| OpenAPI drift | openapi-typescript + codegen-check | CI fails if generated types diverge |
| Bundle budget | custom node script | entry chunks < 130 KB gzipped; Plotly / React Flow / React-PDF / Monaco out of entry |

Deployment

Official Docker images are built and signed by CI:

ghcr.io/edgarzzin/meta-omix-api:0.2.0   # multi-arch: amd64 + arm64
ghcr.io/edgarzzin/meta-omix-web:0.2.0   # multi-arch: amd64 + arm64

The reference deployment is the docker-compose.yml shipped in this repo; the same automation that runs locally (api-migrate, minio-init, ollama-init) applies on a server. A Helm chart for Kubernetes is on the roadmap (BACKLOG.md BE-INFRA-041).

Supply chain

Meta Omix takes supply-chain hardening seriously. Every release ships:

  • Multi-arch images (amd64, arm64) built reproducibly from infra/docker/.
  • SBOM in SPDX-JSON format generated by syft, signed with cosign.
  • Build attestations (SLSA provenance) emitted by GitHub Actions.
  • Dependabot updates for pip / npm / docker / github-actions weekly.
  • CodeQL for Python and JavaScript with security-and-quality query pack.
  • gitleaks on every push and PR.
  • Trivy filesystem + image scans for known CVEs.
  • OSV-Scanner for advisory data crossed against pnpm-lock.yaml and apps/api/uv.lock.
  • dependency-review action blocking GPL-incompatible licenses on PR.
  • DCO check enforcing Signed-off-by on every commit.

CI matrix and per-job permissions live in .github/workflows/. Branch protection on main is documented in .github/BRANCH_PROTECTION.md.

Roadmap

The full backlog is the source of truth: BACKLOG.md.

  • 0 P0 open (all v0.1 blockers shipped).
  • 0 P1 open (Session 4 closed all 35; see the audit at the top of BACKLOG.md).
  • 65 P2 open (multi-user auth, Reactome / UniProt / Figshare ingestion, Helm chart, Tauri desktop bundle, additional refinements).
  • 33 P3 open (longer-term roadmap items).

Architecture decisions are tracked in docs/adr/. Currently:

| ADR | Topic |
|---|---|
| 0001 | Scoring-weight calibration (L2 grid search, ground truth, r=0.92) |
| 0002 | Metadata-quality v2 (10 weighted fields, penalty curve) |
| 0003 | Dedicated /sources/status endpoint and tier classification |
| 0004 | DOC-011 demo-video recategorisation P1 → P2 |

Contributing

PRs welcome. Read CONTRIBUTING.md before opening one — DCO sign-off and Conventional Commits are mandatory because release-please consumes them.

By participating you agree to abide by the Code of Conduct.

Governance

Meta Omix is a single-maintainer project. Authority and decision-making are documented in GOVERNANCE.md and MAINTAINERS.md. Branch protection, code-review gates, and CODEOWNERS enforce single-maintainer review on the GitHub side.

License

Apache License 2.0. See LICENSE for the full text and NOTICE for upstream attributions. Per-dependency licenses are catalogued in THIRD-PARTY-NOTICES.md.

Citation

If Meta Omix helps you find a dataset that lands in a paper, a citation is appreciated:

@software{metaomix_2026,
  author    = {edgarzzin},
  title     = {Meta Omix: A scientific instrument for biomedical dataset discovery},
  year      = {2026},
  url       = {https://github.com/edgarzzin/meta-omix},
  version   = {0.2.0}
}

A Zenodo DOI will be minted from the v1.0 tag; until then, please cite the GitHub URL.

Acknowledgements

Meta Omix stands on the shoulders of dozens of open-source projects. A non-exhaustive thank-you list to the upstream maintainers whose work makes this possible:

  • Backend: FastAPI · Pydantic · SQLAlchemy · Alembic · Celery · pgvector · DuckDB · Biopython · GEOparse · scanpy · pydeseq2 · scispacy · cryptography · argon2-cffi · keyring · Hypothesis · Ruff · uv · Snakemake.
  • Frontend: React · Next.js · TanStack Query / Table / Virtual · Radix UI · Framer Motion · cmdk · Tailwind · Plotly.js · React Flow · Monaco Editor · Zod · react-hook-form · nuqs · Zustand · lucide-react · Storybook · Vitest · Playwright · Lighthouse.
  • Infrastructure: PostgreSQL · Redis · MinIO · Ollama · Grafana · Loki · Promtail · OpenTelemetry · Snakemake.

Bug reports, well-formed PRs, and rigorous critiques of the scoring methodology are all considered contributions and credited in CHANGELOG.md upon release.
