The first bioinformatics-native AI agent skill library.
Built on OpenClaw (180k+ GitHub stars). Local-first. Privacy-focused. Reproducible.
18 skills + 8,000 Galaxy tools. Local-first. No cloud. No guessing.
Snap a photo of a medication in Telegram. ClawBio identifies the drug from the packaging, queries your pharmacogenomic profile from your own genome, and returns a personalised dosage card on your machine, in seconds:
Warfarin | CYP2C9 *1/*2 Intermediate · VKORC1 High Sensitivity
AVOID - DO NOT USE · Standard dose causes over-anticoagulation in this genotype.
Or take any genetic variant (identified by its rsID, a unique label like rs9923231) and search nine genomic databases at once to find every known disease association, tissue-specific effect, and population frequency. Or estimate your genetic predisposition to conditions like type 2 diabetes by combining thousands of small-effect variants into a single polygenic risk score. Or explore the UK Biobank, a half-million-person research dataset, by asking in plain English which fields measure blood pressure, grip strength, or depression, and get back the exact field IDs, descriptions, and linked publications you need.
Every result ships with a reproducibility bundle: commands.sh, environment.yml, and SHA-256 checksums. A reviewer can reproduce your Figure 3 in 30 seconds without emailing you.
You read a paper. You want to reproduce Figure 3. So you:
- Go to GitHub. Clone the repo.
- Wrong Python version. Fix dependencies.
- Need the reference data. Where is it?
- Download 2GB from Zenodo. Link is dead.
- Email the first author. Wait 3 weeks.
- Paths are hardcoded to `/home/jsmith/data/`.
- Two days later: still broken. You give up.
Now imagine the same paper published a skill:
python ancestry_pca.py --demo --output fig3
# Figure 3 reproduced. Identical. SHA-256 verified. 30 seconds.

That's ClawBio. Every figure in your paper should be one command away from reproduction.
A skill is a domain expert's knowledge, frozen into code, that an AI agent executes correctly every time.
ChatGPT / Claude = a smart generalist who guesses at bioinformatics
ClawBio skill = a domain expert's proven pipeline that the AI executes
- Local-first: Your genomic data never leaves your laptop. No cloud uploads, no data exfiltration.
- Reproducible: Every analysis exports `commands.sh`, `environment.yml`, and SHA-256 checksums. Anyone can reproduce it without the agent.
- Modular: Each skill is a self-contained directory (`SKILL.md` + Python scripts) that plugs into the orchestrator.
- MIT licensed: Open-source, free, community-driven.
Ask Claude to "profile my pharmacogenes from this 23andMe file." It'll write plausible Python. But:
- It hallucinates star allele calls and uses outdated CPIC guidelines
- It forgets CYP2D6 *4 is no-function (not reduced)
- You spend 45 minutes debugging its output
- No reproducibility bundle. No audit log. No checksums.
ClawBio encodes the correct bioinformatics decisions so the agent gets it right first time, every time.
Every ClawBio analysis ships with a reproducibility bundle, not as an afterthought, but as part of the output:
    report/
    ├── report.md          # Full analysis with figures and tables
    ├── figures/           # Publication-quality PNGs
    ├── tables/            # CSV data tables
    ├── commands.sh        # Exact commands to reproduce
    ├── environment.yml    # Conda environment snapshot
    └── checksums.sha256   # SHA-256 of every input and output file
Why this matters: a reviewer can re-run your analysis in 30 seconds. A collaborator can reproduce your Figure 3 without emailing you. Future-you can regenerate results two years later from the same bundle.
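A reviewer who wants to check a bundle by hand could do something like the following. This is a minimal sketch, assuming `checksums.sha256` uses the common `sha256sum` layout (hex digest, whitespace, relative path). That layout and the `verify_bundle` helper are assumptions for illustration, not ClawBio's documented API.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large inputs never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(bundle_dir: Path, manifest_name: str = "checksums.sha256") -> list:
    """Return the relative paths whose file is missing or whose digest differs."""
    failures = []
    for line in (bundle_dir / manifest_name).read_text().splitlines():
        parts = line.split(maxsplit=1)  # assumed "<hex digest>  <relative path>"
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, rel_path = parts[0], parts[1].lstrip("*").strip()
        target = bundle_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(rel_path)
    return failures
```

An empty list from `verify_bundle(Path("report"))` would mean every listed file is present and byte-identical to what the manifest recorded.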
| Skill | Status | Description |
|---|---|---|
| Bio Orchestrator | MVP | Routes requests to the right skill automatically |
| PharmGx Reporter | MVP | 12 genes, 51 drugs, CPIC guidelines from consumer genetic data |
| Drug Photo | MVP | Snap a medication photo → personalised dosage card from your genotype |
| ClinPGx | MVP | Gene-drug lookup from ClinPGx, PharmGKB, CPIC, and FDA drug labels |
| GWAS Lookup | MVP | Federated variant query across 9 genomic databases |
| GWAS PRS | MVP | Polygenic risk scores from the PGS Catalog for 6+ traits |
| Profile Report | MVP | Unified personal genomic report: PGx + ancestry + PRS + nutrigenomics |
| UKB Navigator | MVP | Semantic search across the UK Biobank schema |
| Equity Scorer | MVP | HEIM diversity metrics from VCF or ancestry CSV |
| NutriGx Advisor | MVP (community) | Personalised nutrigenomics: 40 SNPs, 13 dietary domains |
| Metagenomics Profiler | MVP | Kraken2 / RGI / HUMAnN3 taxonomy, resistome, and functional profiles |
| Ancestry PCA | MVP | PCA vs SGDP (345 samples, 164 populations) with confidence ellipses |
| Semantic Similarity | MVP | Semantic Isolation Index from 13.1M PubMed abstracts |
| Genome Comparator | MVP | Pairwise IBS vs George Church (PGP-1) + ancestry estimation |
| Galaxy Bridge | MVP | Search, run, and chain 8,000+ Galaxy bioinformatics tools |
| RNA-seq DE | MVP | Bulk/pseudo-bulk differential expression with QC + PCA + contrasts |
| VCF Annotator | Planned | Variant annotation with VEP, ClinVar, gnomAD |
| Lit Synthesizer | Planned | PubMed/bioRxiv search with LLM summarisation and citation graphs |
| scRNA Orchestrator | MVP | Scanpy automation: QC, optional doublet detection, clustering, marker DE analysis, visualisation |
| Struct Predictor | Planned | AlphaFold/Boltz local structure prediction |
| Repro Enforcer | Planned | Export any analysis as Conda env + Singularity + Nextflow pipeline |
Generates a pharmacogenomic report from consumer genetic data (23andMe, AncestryDNA):
- Parses raw genetic data (auto-detects format, including gzip)
- Extracts 31 pharmacogenomic SNPs across 12 genes (CYP2C19, CYP2D6, CYP2C9, VKORC1, SLCO1B1, DPYD, TPMT, UGT1A1, CYP3A5, CYP2B6, NUDT15, CYP1A2)
- Calls star alleles and determines metabolizer phenotypes
- Looks up CPIC drug recommendations for 51 medications
- Zero dependencies. Runs in < 1 second.
python pharmgx_reporter.py --input demo_patient.txt --output report

Demo result: CYP2D6 *4/*4 (Poor Metabolizer) → 10 drugs AVOID (codeine, tramadol, 7 TCAs, tamoxifen), 20 caution, 21 standard.
~7% of people are CYP2D6 Poor Metabolizers: codeine gives them little or no pain relief. ~0.5% carry DPYD variants where a standard 5-FU dose can be lethal. This skill catches both.
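The step from star alleles to a metabolizer phenotype can be illustrated with an activity-score lookup. The sketch below is not the skill's actual code: the `ACTIVITY` table holds only a handful of alleles, the values and cut-offs follow the CPIC activity-score convention as commonly published, and both should be checked against the current CPIC tables. Gene duplications and structural variants are ignored.

```python
# Illustrative activity values for a few CYP2D6 star alleles (assumption:
# a real pipeline would load a complete, versioned allele table instead).
ACTIVITY = {"*1": 1.0, "*2": 1.0, "*10": 0.25, "*41": 0.5, "*4": 0.0, "*5": 0.0}


def cyp2d6_phenotype(diplotype: str) -> str:
    """Map a diplotype such as '*1/*4' to a metabolizer phenotype label."""
    alleles = diplotype.split("/")
    if len(alleles) != 2 or any(allele not in ACTIVITY for allele in alleles):
        return "Indeterminate"  # unknown allele: refuse to guess
    score = sum(ACTIVITY[allele] for allele in alleles)
    if score == 0:
        return "Poor Metabolizer"
    if score < 1.25:
        return "Intermediate Metabolizer"
    if score <= 2.25:
        return "Normal Metabolizer"
    return "Ultrarapid Metabolizer"


print(cyp2d6_phenotype("*4/*4"))  # two no-function alleles
```

Returning "Indeterminate" for anything outside the table is a deliberate choice here: a wrong phenotype is worse than no phenotype.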
Snap a photo of any medication in Telegram. ClawBio identifies the drug from the packaging and returns a personalised dosage card against your own genotype.
- Claude vision extracts drug name and visible dose from the photo
- Cross-references your 23andMe genotype against 31 PGx SNPs
- Four-tier classification: STANDARD DOSING / USE WITH CAUTION / AVOID / INSUFFICIENT DATA
- Correct VKORC1 complement-strand handling (23andMe reports minus strand for rs9923231)
- Works for warfarin, clopidogrel, codeine, simvastatin, tamoxifen, sertraline, and 20+ others
python pharmgx_reporter.py --drug warfarin --dose "5mg" --input my_23andme.txt --output report

No command needed in Telegram: send any medication photo and RoboTerri triggers the skill automatically.
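The complement-strand handling listed above amounts to flipping each base before comparing a genotype with a guideline table. A minimal sketch of that idea follows; the function names are hypothetical, and which SNPs need flipping is a decision the skill has to encode per variant.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_opposite_strand(genotype: str) -> str:
    """Complement each base of an unphased genotype call."""
    calls = genotype.upper()
    if set(calls) - set("ACGT"):
        raise ValueError(f"cannot complement non-ACGT call: {genotype!r}")
    return calls.translate(COMPLEMENT)


def harmonise(genotype: str, reported_on_opposite_strand: bool) -> str:
    """Return the genotype on the lookup table's strand, alleles sorted."""
    if reported_on_opposite_strand:
        genotype = to_opposite_strand(genotype)
    return "".join(sorted(genotype.upper()))
```

Complementing cannot resolve palindromic SNPs (A/T or C/G), because both strands show the same pair of letters. Those need strand information from the array manifest rather than from the call itself.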
Federated variant query across nine genomic databases in a single command:
| Database | What you get |
|---|---|
| GWAS Catalog | Genome-wide significant associations |
| gnomAD | Allele frequencies across 125,748 exomes |
| ClinVar | Clinical significance and condition links |
| Open Targets | Disease-gene evidence scores |
| Ensembl | Functional annotation, regulatory impact |
| GTEx | eQTL data, tissue-specific expression effects |
| LDlink | Linkage disequilibrium across 26 populations |
| UK Biobank PheWAS | Phenome-wide associations across 4,000+ traits |
| LOVD | Variant pathogenicity database |
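A federated query of this kind is essentially a parallel fan-out that tolerates individual failures. The sketch below shows that pattern with placeholder fetchers. The `federated_lookup` name and the result layout are assumptions for illustration, and no real database client is called.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Fetcher = Callable[[str], dict]


def federated_lookup(rsid: str, sources: Dict[str, Fetcher], timeout: float = 30.0) -> dict:
    """Query every source in parallel; record failures instead of aborting."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(sources) or 1) as pool:
        futures = {name: pool.submit(fetch, rsid) for name, fetch in sources.items()}
        for name, future in futures.items():
            try:
                results[name] = {"ok": True, "data": future.result(timeout=timeout)}
            except Exception as exc:  # one failing source must not sink the rest
                results[name] = {"ok": False, "error": str(exc)}
    return results
```

Keeping failures in the result rather than raising lets a report state which databases were unreachable instead of silently omitting them.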
python gwas_lookup.py --rsid rs3798220 --output report
python gwas_lookup.py --demo --output /tmp/gwas_lookup_demo

Semantic search across the UK Biobank schema. Ask in plain English what UK Biobank measures about any phenotype and get field IDs, descriptions, data types, participant counts, and linked publications back instantly.
python ukb_navigator.py --query "grip strength" --output report
python ukb_navigator.py --field 21001 --output report # BMI
python ukb_navigator.py --demo --output /tmp/ukb_demo

Built on a ChromaDB embedding of the full UKB Data Showcase (22,000+ fields).
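To show the shape of a query-to-field lookup without any dependencies, the sketch below ranks a tiny hand-typed catalogue by token-overlap cosine similarity. This is a lexical stand-in, not the ChromaDB embedding search the skill uses. The `FIELDS` dictionary is a four-entry illustration whose titles should be checked against the Data Showcase.

```python
import math
import re
from collections import Counter

# Hand-typed sample of field IDs and titles (assumption: illustrative only).
FIELDS = {
    21001: "Body mass index (BMI)",
    46: "Hand grip strength (left)",
    47: "Hand grip strength (right)",
    4080: "Systolic blood pressure, automated reading",
}


def tokens(text: str) -> Counter:
    """Lower-case bag of alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, top_k: int = 3) -> list:
    """Return up to top_k (field_id, score) pairs with a non-zero score."""
    q = tokens(query)
    scored = [(fid, cosine(q, tokens(title))) for fid, title in FIELDS.items()]
    hits = [hit for hit in scored if hit[1] > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top_k]
```

A lexical match like this misses synonyms ("hypertension" would not find field 4080), which is the gap an embedding index is meant to close.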
Runs principal component analysis on your cohort against the SGDP reference panel (345 samples, 164 global populations):
- Contig normalisation (chr1 vs 1)
- IBD removal (related individuals filtered)
- Common biallelic SNPs only
- Confidence ellipses per population
- Publication-quality 4-panel figure generated instantly
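Two of the preprocessing steps above are simple enough to sketch directly: contig normalisation and the biallelic-SNP filter. The helpers below are illustrative assumptions about how such checks could look, not the skill's code. Allele-frequency filtering and IBD removal are left out.

```python
def normalise_contig(name: str) -> str:
    """Strip a leading 'chr' so 'chr1' and '1' compare equal; map 'M' to 'MT'."""
    bare = name[3:] if name.lower().startswith("chr") else name
    return "MT" if bare.upper() in {"M", "MT"} else bare


def is_biallelic_snp(ref: str, alt: str) -> bool:
    """True for a single-base REF with exactly one single-base ALT allele."""
    bases = set("ACGT")
    return "," not in alt and ref.upper() in bases and alt.upper() in bases
```

Normalising both the cohort and the reference panel through the same function before intersecting sites avoids the silent empty overlap that mixed `chr1` and `1` naming otherwise produces.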
python ancestry_pca.py --demo --output ancestry_report

Demo result: 736 Peruvian samples across 28 indigenous populations. Amazonian groups (Matzes, Awajun, Candoshi) sit in genetic space that no SGDP population occupies: genuinely underrepresented, not just in GWAS, but in the reference panels themselves.
Computes a Semantic Isolation Index for diseases using 13.1M PubMed abstracts and PubMedBERT embeddings (768-dim):
- SII (Semantic Isolation Index): higher = more isolated in literature
- KTP (Knowledge Transfer Potential): higher = more cross-disease spillover
- RCC (Research Clustering Coefficient): diversity of research approaches
- Temporal Drift: how research focus evolves over time
- Publication-quality 4-panel figure
python semantic_sim.py --demo --output sem_report

Key finding: Neglected tropical diseases are 38% more semantically isolated (P < 0.0001, Cohen's d = 0.84). 14 of the 25 most isolated diseases are Global South priority conditions. Knowledge silos kill innovation: a malaria immunology breakthrough could help leishmaniasis, but the literatures don't talk to each other.
Corpas et al. (2026). HEIM: Health Equity Index for Measuring structural bias in biomedical research. Under review.
git clone https://github.com/ClawBio/ClawBio.git && cd ClawBio
pip install -r requirements.txt
python clawbio.py run pharmgx --demo

PharmGx demo runs in <2 seconds. Only needs Python 3.10+.
python clawbio.py list # See available skills
python clawbio.py run pharmgx --demo # Pharmacogenomics (1s)
python clawbio.py run equity --demo # Equity scoring (55s)
python clawbio.py run nutrigx --demo # Nutrigenomics (60s)
python clawbio.py run metagenomics --demo # Metagenomics (3s)
python clawbio.py run scrna --demo # scRNA clustering + marker detection (PBMC3k-first demo)
python clawbio.py run scrna --demo --doublet-method scrublet # Optional doublet detection before clustering
python clawbio.py run compare --demo # Manuel Corpas vs George Church (10s)
python clawbio.py run gwas-lookup --demo # rs3798220 across 9 databases (5s)
python clawbio.py run prs --demo # Polygenic risk scores (10s)
python clawbio.py run ukb-navigator --demo # UK Biobank schema search (5s)
python clawbio.py run profile --demo # Unified genomic profile (30s)
python clawbio.py run galaxy --demo # Galaxy Bridge FastQC demo (offline)
python clawbio.py run rnaseq --demo # RNA-seq DE demo (bulk/pseudo-bulk)

python clawbio.py run pharmgx --input my_23andme.txt --output results/
python clawbio.py run rnaseq --input counts.csv,metadata.csv --output results_rnaseq/

pip install pytest
python -m pytest

Get your own ClawBio Telegram bot running in the cloud, no coding required.
You'll need:
- A Telegram bot token: message @BotFather on Telegram and send `/newbot`
- A free LLM API key: get one at aistudio.google.com/apikey
Railway will ask you to paste these during setup. That's it: your bot will be live in ~2 minutes.
Privacy note: When running on Railway, genetic data is processed on Railway's servers (not your machine). Data is not sent to external APIs, but it does exist on the Railway container temporarily. For maximum privacy, use the local install instead.
RoboTerri: ClawBio's Telegram agent, inspired by Prof. Teresa K. Attwood
ClawBio skills are available through RoboTerri, a public Telegram bot running against a real human genome (Manuel Corpas, CC0 public domain). Named after Prof. Teresa K. Attwood, a pioneer of bioinformatics education, founding Chair of GOBLET, and winner of the 2021 ISCB Outstanding Contributions Award.
Try RoboTerri now, no install needed.
Ask it anything:
- "Give me my pharmacogenomic summary" → analyses 12 genes, 51 drugs
- "What diseases am I at risk for?" → polygenic risk scores for 6 conditions
- Send a photo of any medication → checks CYP2D6/CYP2C19 metaboliser status
/demo pharmgx
/demo prs
/demo nutrigx
/demo compare
/demo profile
You: [send 23andMe file]
RoboTerri: Running PharmGx Reporter...
CYP2D6 *4/*4 → Poor Metabolizer → 10 drugs AVOID
[report.md attached]
[3 figures attached]
You: [send photo of warfarin packet]
RoboTerri: Warfarin detected. Running Drug Photo skill...
CYP2C9 *1/*2 · VKORC1 High Sensitivity
AVOID - DO NOT USE at standard dose.
You: run gwas-lookup rs3798220
RoboTerri: Querying 9 databases...
rs3798220 (LPA): coronary artery disease, Lp(a) levels.
eQTL in liver (GTEx). gnomAD MAF 0.07.
RoboTerri auto-detects file type (23andMe .txt, AncestryDNA .csv, VCF, FASTQ) and routes to the right skill via the Bio Orchestrator. Photos of medications trigger the Drug Photo skill automatically, with no command needed.
Install your own RoboTerri: Set up your own Telegram bot running ClawBio skills in ~20 minutes.
ClawBio indexes 8,000+ bioinformatics tools from usegalaxy.org via the Galaxy Bridge skill. Search by natural language, inspect tool schemas, and execute remotely, all from the CLI.
# Search Galaxy tools by keyword
python skills/galaxy-bridge/galaxy_bridge.py --search "metagenomics"
# Browse all 86 tool categories
python skills/galaxy-bridge/galaxy_bridge.py --list-categories
# Run a tool on Galaxy (requires GALAXY_API_KEY)
python skills/galaxy-bridge/galaxy_bridge.py --run fastqc --input reads.fq.gz --output results/
# Demo mode (offline, no API key)
python skills/galaxy-bridge/galaxy_bridge.py --demo

Cross-platform chaining: Galaxy VEP annotates variants → ClawBio PharmGx generates dosage report. Galaxy Kraken2 classifies reads → ClawBio metagenomics profiler. Neither can do this alone.
Built on BioBlend (Galaxy Python SDK). Developed in collaboration with the Galaxy ML SIG.
    Telegram (RoboTerri)     CLI (clawbio.py)     Python (import clawbio)
             │                      │                        │
             └──────────────────────┼────────────────────────┘
                                    │
                             ┌──────▼──────┐
                             │     Bio     │  ← routes by file type + keywords
                             │ Orchestrator│
                             └──────┬──────┘
                                    │
    ┌───────────────────────────────┼───────────────────────────────┐
    │                                                               │
      PharmGx    Equity    NutriGx    Metagenomics    Ancestry
      Reporter   Scorer    Advisor    Profiler        PCA  ...
    │                                                               │
    └───────────────────────────────┬───────────────────────────────┘
                                    │
                             ┌──────▼──────┐
                             │  Markdown   │  ← report + figures + checksums
                             │   Report    │    + reproducibility bundle
                             └─────────────┘
Each skill is standalone: the orchestrator routes to the right one, but every skill also works independently. The `clawbio.run_skill()` API is importable by any agent (RoboTerri, RoboIsaac, Claude Code).
See docs/architecture.md for the full design.
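Routing by file type and keywords can be pictured as two lookup tables consulted in order. The sketch below is a guess at the general shape only: the tables, the `route` function, and the file-type-before-keywords precedence are all hypothetical, and the real rules live in the orchestrator and `docs/architecture.md`. The skill names are the ones the CLI examples use.

```python
from pathlib import Path
from typing import Optional

# Hypothetical routing tables; the real orchestrator's rules may differ.
SUFFIX_TO_SKILL = {
    ".txt": "pharmgx",        # e.g. a 23andMe raw data export
    ".vcf": "equity",
    ".fastq": "metagenomics",
    ".fq": "metagenomics",
}
KEYWORD_TO_SKILL = {
    "rsid": "gwas-lookup",
    "risk score": "prs",
    "biobank": "ukb-navigator",
}


def route(message: str = "", filename: Optional[str] = None) -> Optional[str]:
    """Pick a skill name: file type first, then keywords, else None."""
    if filename:
        # Ignore a trailing .gz so 'reads.fq.gz' routes like 'reads.fq'.
        suffixes = [s.lower() for s in Path(filename).suffixes if s.lower() != ".gz"]
        if suffixes and suffixes[-1] in SUFFIX_TO_SKILL:
            return SUFFIX_TO_SKILL[suffixes[-1]]
    text = message.lower()
    for keyword, skill in KEYWORD_TO_SKILL.items():
        if keyword in text:
            return skill
    return None
```

Returning `None` when nothing matches leaves the caller free to ask the user for clarification rather than running a default skill on the wrong data.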
ClawBio is designed to be discovered and used by AI coding agents, not just humans.
| Resource | Purpose |
|---|---|
| `llms.txt` | Token-optimized project summary for any LLM (llmstxt.org standard) |
| `AGENTS.md` | Universal guide for AI coding agents: setup, commands, style, structure, git workflow |
| `CLAUDE.md` | Claude-specific routing table, CLI reference, demo commands, safety rules |
| `skills/catalog.json` | Machine-readable skill index with trigger keywords, chaining partners, and demo commands |
Agents can also run python clawbio.py list to discover available skills programmatically.
We want skills from the bioinformatics community. If you work with genomics, proteomics, metabolomics, imaging, or clinical data, wrap your pipeline as a skill.
| Skill | What | Your expertise |
|---|---|---|
| claw-gwas | PLINK/REGENIE automation | Statistical genetics |
| claw-acmg | Clinical variant classification | Clinical genomics |
| claw-pathway | GO/KEGG enrichment | Functional genomics |
| claw-phylogenetics | IQ-TREE/RAxML automation | Evolutionary biology |
| claw-proteomics | MaxQuant/DIA-NN | Proteomics |
| claw-spatial | Visium/MERFISH | Spatial transcriptomics |
See CONTRIBUTING.md for the submission process and templates/SKILL-TEMPLATE.md for the skill template.
Join the contributors community on Telegram: t.me/ClawBioContributors
ClawBio is built on OpenClaw. On 1 March 2026, at the UK AI Agent Hack at Imperial College London, Manuel Corpas introduced ClawBio to Peter Steinberger, the creator of OpenClaw itself.
Manuel Corpas introduces ClawBio to Peter Steinberger · UK AI Agent Hack, Imperial College London · Watch on YouTube
ClawBio was presented at DoraHacks Demo Day at Imperial College London on 7 March 2026. Live demo: pharmacogenomics, intelligent routing, multi-channel agents, and Drug Photo.

ClawBio at DoraHacks Demo Day · Imperial College London · Watch on YouTube
ClawBio was announced at the London Bioinformatics Meetup on 26 February 2026.
- Slides: clawbio.github.io/ClawBio/slides/
- Talk: 10 Tips for Becoming a Top 1% AI User, with live demos of all three MVP skills
If you use ClawBio in your research, please cite:
@software{clawbio_2026,
author = {Corpas, Manuel},
title = {ClawBio: An Open-Source Library of AI Agent Skills for Reproducible Bioinformatics},
year = {2026},
url = {https://github.com/ClawBio/ClawBio}
}
- Try RoboTerri: t.me/RoboTerri_bot (query a real genome on Telegram, no install needed)
- Slides: clawbio.github.io/ClawBio/slides/
- Tutorial: Install your own RoboTerri
- OpenClaw: The agent platform
- ClawHub: Skill registry
- HEIM Index: Health Equity Index for Minorities
MIT: clone it, run it, build a skill, submit a PR.


