An open-source MS/MS spectral library for fungal natural products
MycoMSBase brings together high-resolution tandem mass spectra alongside ion mobility data and compound metadata — biosynthetic class, fungal producer, and literature references — to support the dereplication and discovery of fungal secondary metabolites in metabolomics studies.
Work in progress — MycoMSBase is actively being developed. Features and data coverage are growing. Feedback, suggestions, and contributions are very welcome (see Contributing).
- Features
- Architecture
- Prerequisites
- Quick start
- Configuration reference
- Data format
- Development
- Contributing
- Citing
MycoMSBase supports four complementary search modes:
- Spectral similarity search — powered by matchms, a Python library for MS/MS spectrum processing and cosine similarity scoring. Query with a peak list and a similarity threshold to retrieve the closest library matches.
- Library matching via SpecReboot — spectra in MycoMSBase are compatible with SpecReboot, a tool for large-scale spectral library matching and dereplication in fungal metabolomics workflows.
- Fragment peak search — search for spectra containing specific fragment ions within a configurable mass tolerance.
- Substructure search — filter by chemical substructure using SMILES queries, powered by the Bingo extension for PostgreSQL.
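The two peak-based modes above can be illustrated with a small, dependency-free sketch. It is not the matchms implementation: `contains_fragments` and `cosine_greedy` are hypothetical helper names, the peak lists and accessions are made up, and the scoring is a simplified greedy cosine with no intensity or m/z weighting.

```python
from math import sqrt

Peak = tuple[float, float]  # (m/z, intensity)


def contains_fragments(peaks: list[Peak], targets: list[float], tol_da: float = 0.05) -> bool:
    """Fragment peak search: every target m/z needs a peak within tol_da."""
    return all(any(abs(mz - t) <= tol_da for mz, _ in peaks) for t in targets)


def cosine_greedy(query: list[Peak], ref: list[Peak], tol_da: float = 0.05) -> float:
    """Simplified greedy cosine: pair peaks within tol_da, largest products first."""
    pairs = sorted(
        (
            (q_int * r_int, i, j)
            for i, (q_mz, q_int) in enumerate(query)
            for j, (r_mz, r_int) in enumerate(ref)
            if abs(q_mz - r_mz) <= tol_da
        ),
        reverse=True,
    )
    used_q, used_r, dot = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_q and j not in used_r:  # each peak is matched at most once
            used_q.add(i)
            used_r.add(j)
            dot += product
    norm = sqrt(sum(x * x for _, x in query)) * sqrt(sum(x * x for _, x in ref))
    return dot / norm if norm else 0.0


# Made-up library and query, for illustration only.
library = {
    "HYPOTHETICAL-0001": [(91.05, 30.0), (119.05, 100.0), (147.04, 55.0)],
    "HYPOTHETICAL-0002": [(105.07, 100.0), (133.06, 40.0)],
}
query = [(91.06, 28.0), (119.04, 100.0), (147.05, 60.0)]

threshold = 0.8
hits = {acc: s for acc, peaks in library.items()
        if (s := cosine_greedy(query, peaks)) >= threshold}
with_fragment = [acc for acc, peaks in library.items()
                 if contains_fragments(peaks, [119.05])]
```

Here `hits` keeps only the first entry, and `with_fragment` lists the entries that contain a peak near m/z 119.05. The deployed service may preprocess spectra and score them differently.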
MycoMSBase is one of the few fungal MS/MS libraries to include collisional cross-section (CCS) values from ion mobility measurements. CCS values are stored per spectrum and are searchable, enabling multi-dimensional dereplication that combines retention time, exact mass, MS/MS fragmentation, and ion shape.
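As a rough illustration of how CCS adds a dimension to dereplication, the sketch below filters candidates by precursor m/z and CCS together. The `LibraryEntry` class, the accessions, the values, and the default tolerances (10 ppm, 3 % CCS) are all assumptions chosen for the example, not MycoMSBase settings.

```python
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    accession: str       # hypothetical identifier, not a real accession
    precursor_mz: float
    ccs: float           # collisional cross-section in Å²


def ppm_error(observed: float, reference: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - reference) / reference * 1e6


def candidates(entries, mz, ccs, ppm_tol=10.0, ccs_tol_pct=3.0):
    """Keep entries whose precursor m/z and CCS both fall within tolerance."""
    return [
        e for e in entries
        if abs(ppm_error(mz, e.precursor_mz)) <= ppm_tol
        and abs(ccs - e.ccs) / e.ccs * 100.0 <= ccs_tol_pct
    ]


# Made-up entries: A and B are nearly isobaric but differ in CCS.
library = [
    LibraryEntry("HYPOTHETICAL-A", 301.1410, 170.0),
    LibraryEntry("HYPOTHETICAL-B", 301.1412, 185.0),
    LibraryEntry("HYPOTHETICAL-C", 415.2110, 200.0),
]

matches = candidates(library, mz=301.1411, ccs=171.5)
```

Exact mass alone cannot separate entries A and B in this example; the CCS filter leaves only A.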
Fungal producer metadata is enriched with full NCBI taxonomy — genus, family, order, class, and phylum — fetched automatically via the NCBI Entrez API. The web interface provides:
- Interactive taxonomy tree — a cladogram showing all fungal producers with biosynthetic class distributions as pie charts at each node, with rank-level zoom (Kingdom / Family / Genus).
- Taxonomy filters — filter the library by genus or species directly in the search panel.
- NCBI Taxonomy links — each record links out to the NCBI Taxonomy Browser entry for its producer species.
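The per-node class distributions shown in the tree could be computed along these lines. This is a minimal sketch under assumed data shapes: the record layout, the `class_distribution` helper, and the placeholder taxon names are hypothetical and do not reflect the actual frontend or database schema.

```python
from collections import Counter, defaultdict

RANKS = ("phylum", "class", "order", "family", "genus")


def class_distribution(records, rank):
    """Count biosynthetic classes for each taxon at the given rank."""
    if rank not in RANKS:
        raise ValueError(f"unknown rank: {rank}")
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["lineage"][rank]][rec["compound_class"]] += 1
    return dict(counts)


# Placeholder lineages; real ones would come from NCBI Taxonomy.
base = {"phylum": "Phylum P", "class": "Class C", "order": "Order O"}
records = [
    {"compound_class": "Polyketide", "lineage": {**base, "family": "Family X", "genus": "Genus A"}},
    {"compound_class": "Terpene", "lineage": {**base, "family": "Family X", "genus": "Genus A"}},
    {"compound_class": "Polyketide", "lineage": {**base, "family": "Family X", "genus": "Genus B"}},
]

by_genus = class_distribution(records, "genus")
by_family = class_distribution(records, "family")
```

Each `Counter` holds the slice sizes for one pie chart; moving from genus to family simply re-aggregates the same records at a higher rank.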
Every record is annotated with:
- Biosynthetic class (Polyketide, Terpene, NRPS-like, PKS-NRPS) — drives the charts and taxonomy tree visualisation.
- Fungal producer with full taxonomic lineage.
- Literature reference (DOI) linking to the original publication.
MycoMSBase is deployed as a set of Docker services orchestrated with Compose:
Browser
  └── nginx (reverse proxy :8080)
        ├── mb3frontend          React/TypeScript web app
        ├── mb3server            Go REST API
        ├── similarity-service   Python spectral search (matchms)
        └── export-service       Java bulk export (MGF/MSP)
              └── postgres       PostgreSQL + Bingo (substructure search)
| Service | Technology | Role |
|---|---|---|
| `postgres` | PostgreSQL + Bingo | Stores records; enables substructure search |
| `mb3server` | Go | REST API backend |
| `similarity-service` | Python / FastAPI / matchms | Cosine spectral similarity search |
| `export-service` | Java | Bulk MGF/MSP export |
| `mb3frontend` | React / TypeScript | Web interface |
| `nginx` | nginx | Routes all services under one port |
| `mb3tool` | Go | One-shot database initialisation from a data repo |
- Docker ≥ 24 with Compose v2
- Git (for loading data from a repository)
- ~4 GB RAM for the full stack
git clone https://github.com/ECharria/mycomsbase.git
cd mycomsbase

Copy the annotated template and edit the key variables:

cp compose/env.dist compose/.env

Minimum required changes in compose/.env:

# Where PostgreSQL data will be stored on disk
DB_LOCAL_PATH=./../data/postgres-data
# Git repository containing the MassBank .txt record files
MB_GIT_REPO="https://github.com/<your-org>/<your-data-repo>"
MB_GIT_BRANCH=main
# Local path to the record files (used by the similarity and export services)
MB_DATA_DIRECTORY="./../data/mycomsbase-data"
# Hostname or IP of the server (use localhost for local deployment)
MB3_API_HOST=localhost
MB3_FRONTEND_HOST=localhost

Build and start the services:

cd compose
docker compose build
docker compose up -d

Load the records into the database:

docker compose run --rm mb3tool

The `mb3tool` service clones the data repository and imports all records into PostgreSQL.
Once all services are healthy, open http://localhost:8080/MycoMSBase in your browser.
All settings live in compose/.env. The fully annotated template is compose/env.dist.
| Variable | Default | Description |
|---|---|---|
| `DB_USER` / `DB_PASSWORD` / `DB_NAME` | `mycomsbase` / `mycomsbasepassword` / `mycomsbase` | PostgreSQL credentials |
| `DB_LOCAL_PATH` | `./../data/postgres-data` | Host path for database storage |
| `MB3_API_BASE_URL` | `/MycoMSBase-api` | API base path |
| `MB3_FRONTEND_BASE_URL` | `/MycoMSBase` | Frontend base path |
| `MB_GIT_REPO` | (none) | URL of the MassBank-format data repository |
| `MB_DATA_DIRECTORY` | `./../data/mycomsbase-data` | Local path to record .txt files |
| `COSINE_TOLERANCE` | `0.05` | Fragment mass tolerance in Da for similarity search |
| `SIMILARITY_SERVICE_VERBOSE` | `false` | Verbose logging in similarity service |
| `DISTRIBUTOR_TEXT` | (none) | Institution name shown on the About page |
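Before starting the stack it can help to check that compose/.env sets everything that has no default. The sketch below is a simplified stand-in for Compose's own env-file parsing, not a replacement for it; `parse_env`, `DEFAULTS`, and `REQUIRED` are hypothetical names, and treating only `MB_GIT_REPO` as required is an assumption based on the table above.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments, stripping quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in ('"', "'"):
            value = value[1:-1]  # drop matching surrounding quotes
        settings[key.strip()] = value
    return settings


# Defaults copied from the table above; only a subset is shown.
DEFAULTS = {
    "MB3_API_BASE_URL": "/MycoMSBase-api",
    "MB3_FRONTEND_BASE_URL": "/MycoMSBase",
    "COSINE_TOLERANCE": "0.05",
    "SIMILARITY_SERVICE_VERBOSE": "false",
}
REQUIRED = ["MB_GIT_REPO"]  # assumption: no default, so it must be set

sample = """\
# Example fragment of compose/.env
MB_GIT_BRANCH=main
COSINE_TOLERANCE=0.02
MB_DATA_DIRECTORY="./../data/mycomsbase-data"
"""

settings = parse_env(sample)
merged = {**DEFAULTS, **settings}
missing = [key for key in REQUIRED if not merged.get(key)]
tolerance_da = float(merged["COSINE_TOLERANCE"])
```

With this sample, `missing` reports `MB_GIT_REPO`, and the file's `COSINE_TOLERANCE` overrides the default.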
Records follow the MassBank record format. MycoMSBase adds three fields to each record:
PUBLICATION: doi:10.xxxx/xxxxx ← literature reference
CH$COMPOUND_CLASS: Polyketide ← biosynthetic class
SP$SCIENTIFIC_NAME: Hypoxylon rickii ← fungal producer
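For downstream scripts, the three fields can be read back out of a record with a few lines of Python. This is a sketch that assumes each field sits on its own line as `TAG: value`, and that the `←` notes above are README annotations rather than part of the record; `extract_annotations` and the sample accession are hypothetical.

```python
FIELDS = {
    "PUBLICATION": "doi",
    "CH$COMPOUND_CLASS": "compound_class",
    "SP$SCIENTIFIC_NAME": "fungal_producer",
}


def extract_annotations(record_text: str) -> dict[str, str]:
    """Pull the three annotation fields out of a MassBank-format record."""
    found = {}
    for line in record_text.splitlines():
        tag, sep, value = line.partition(": ")
        if sep and tag in FIELDS:
            found[FIELDS[tag]] = value.strip()
    return found


# Abbreviated, made-up record; real records contain many more fields.
record = """\
ACCESSION: HYPOTHETICAL-0001
PUBLICATION: doi:10.xxxx/xxxxx
CH$COMPOUND_CLASS: Polyketide
SP$SCIENTIFIC_NAME: Hypoxylon rickii
"""

annotations = extract_annotations(record)
```

The output keys mirror the column names of the reference table so the two sources can be compared directly.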
A reference table of all compounds (mycomsbase_unique_compounds.csv) is included at the repository root with the columns:
inchikey · compound_name · compound_class · fungal_producer · n_spectra · doi · example_accession
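The reference table lends itself to quick summaries. The sketch below totals spectra per biosynthetic class, assuming a comma-separated file with a header row that uses exactly the column names listed above; the rows shown are placeholders, not library data.

```python
import csv
import io
from collections import Counter

COLUMNS = ["inchikey", "compound_name", "compound_class", "fungal_producer",
           "n_spectra", "doi", "example_accession"]


def spectra_per_class(csv_file) -> Counter:
    """Sum n_spectra for each compound_class in the reference table."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[row["compound_class"]] += int(row["n_spectra"])
    return totals


# Placeholder rows standing in for mycomsbase_unique_compounds.csv.
SAMPLE = "\n".join([
    ",".join(COLUMNS),
    "KEY-A,compound A,Polyketide,Producer A,3,doi:10.xxxx/xxxxx,HYPOTHETICAL-0001",
    "KEY-B,compound B,Polyketide,Producer B,2,doi:10.xxxx/xxxxx,HYPOTHETICAL-0002",
    "KEY-C,compound C,Terpene,Producer A,1,doi:10.xxxx/xxxxx,HYPOTHETICAL-0003",
]) + "\n"

totals = spectra_per_class(io.StringIO(SAMPLE))

# Against the real file (path relative to the repository root):
# with open("mycomsbase_unique_compounds.csv", newline="") as fh:
#     totals = spectra_per_class(fh)
```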
docker compose build mb3server # Go backend
docker compose build mb3frontend # React frontend
docker compose build similarity-service # Python similarity service
docker compose up -d <service>

The REST API is served at http://localhost:8080/MycoMSBase-api/.
The full OpenAPI specification is at config-openapi.yaml.
| Endpoint | Method | Description |
|---|---|---|
| `/similarity` | POST | Cosine similarity search against the library |
| `/export/mgf` | POST | Export selected records as MGF |
| `/version` | GET | Service version and loaded library size |
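A quick way to confirm that the API is reachable is to call `/version`. The sketch below builds endpoint URLs from the default host, port, and base path given above; `endpoint_url` and `fetch_version` are hypothetical helpers, and decoding the response as JSON is an assumption, so check config-openapi.yaml for the actual schema. Request bodies for the POST endpoints are defined there as well and are not guessed at here.

```python
import json
from urllib.request import urlopen


def endpoint_url(host="localhost", port=8080,
                 base_path="/MycoMSBase-api", endpoint="/version"):
    """Join host, port, API base path, and endpoint into one URL."""
    return f"http://{host}:{port}/{base_path.strip('/')}/{endpoint.lstrip('/')}"


def fetch_version(timeout=5.0):
    """GET /version and decode the body, assuming it is JSON (not confirmed here)."""
    with urlopen(endpoint_url(endpoint="/version"), timeout=timeout) as resp:
        return json.load(resp)


version_url = endpoint_url()
similarity_url = endpoint_url(endpoint="similarity")

# With the stack running locally:
# print(fetch_version())
```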
MycoMSBase is under active development and we warmly welcome contributions of all kinds — bug reports, feature ideas, or code improvements.
- Open a pull request — for code changes or new records, fork the repo and submit a PR
- Open an issue — for bug reports or feature requests
- Send us your samples — if you have fungal samples or extracts that could expand the library, we would love to hear from you
- Get in touch — for questions, sample contributions, or collaboration: esteban.charriagiron@wur.nl
If you use MycoMSBase, please cite the underlying MassBank3 infrastructure:
Neumann S. et al. MassBank3: the spectral reference library's next generation software product.
DOI: 10.5281/zenodo.16923315
Distributed under the GPL-3.0 license — see LICENSE for details.