MycoMSBase

An open-source MS/MS spectral library for fungal natural products

MycoMSBase combines high-resolution tandem mass spectra with ion mobility data and compound metadata (biosynthetic class, fungal producer, and literature references) to support the dereplication and discovery of fungal secondary metabolites in metabolomics studies.


Work in progress — MycoMSBase is actively being developed. Features and data coverage are growing. Feedback, suggestions, and contributions are very welcome (see Contributing).


Features

Spectral library search

MycoMSBase supports four complementary search modes:

  • Spectral similarity search — powered by matchms, a Python library for MS/MS spectrum processing and cosine similarity scoring. Query with a peak list and a similarity threshold to retrieve the closest library matches.
  • Library matching via SpecReboot — spectra in MycoMSBase are compatible with SpecReboot, a tool for large-scale spectral library matching and dereplication in fungal metabolomics workflows.
  • Fragment peak search — search for spectra containing specific fragment ions within a configurable mass tolerance.
  • Substructure search — filter by chemical substructure using SMILES queries, powered by the Bingo extension for PostgreSQL.
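
To make the similarity and fragment searches above more concrete, here is a minimal, dependency-free sketch of both ideas. It is not the code of the similarity service, which relies on matchms; the function names and the toy spectra are hypothetical, and the greedy peak pairing is only one common way of computing a cosine score. The default tolerance of 0.05 Da mirrors the COSINE_TOLERANCE default listed in the configuration reference.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def has_fragments(peaks: Sequence[Peak], targets: Sequence[float], tol: float = 0.05) -> bool:
    """Return True if every target m/z has at least one peak within +/- tol Da."""
    return all(any(abs(mz - t) <= tol for mz, _ in peaks) for t in targets)


def cosine_greedy(query: Sequence[Peak], reference: Sequence[Peak], tol: float = 0.05) -> float:
    """Greedy cosine score: pair peaks within tol Da, largest intensity products first."""
    candidates: List[Tuple[float, int, int]] = []
    for i, (mz_q, int_q) in enumerate(query):
        for j, (mz_r, int_r) in enumerate(reference):
            if abs(mz_q - mz_r) <= tol:
                candidates.append((int_q * int_r, i, j))
    candidates.sort(reverse=True)

    used_q, used_r = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i in used_q or j in used_r:
            continue  # each peak may be paired at most once
        used_q.add(i)
        used_r.add(j)
        dot += product

    norm_q = sum(inten * inten for _, inten in query) ** 0.5
    norm_r = sum(inten * inten for _, inten in reference) ** 0.5
    return dot / (norm_q * norm_r) if norm_q and norm_r else 0.0


# Toy data (hypothetical spectra, not taken from the library)
query = [(100.00, 1.0), (150.02, 0.5), (200.00, 0.2)]
library = {
    "identical": [(100.00, 1.0), (150.02, 0.5), (200.00, 0.2)],
    "unrelated": [(90.00, 1.0), (310.00, 0.7)],
}
threshold = 0.7
scores = {name: cosine_greedy(query, peaks) for name, peaks in library.items()}
matches = [name for name, score in scores.items() if score >= threshold]
```

With these toy spectra only the "identical" entry passes the threshold, and has_fragments(query, [150.0]) is true because the 150.02 peak lies within 0.05 Da of the target.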

Ion mobility data

MycoMSBase is one of the few fungal MS/MS libraries to include collisional cross-section (CCS) values from ion mobility measurements. CCS values are stored per spectrum and are searchable, enabling multi-dimensional dereplication that combines retention time, exact mass, MS/MS fragmentation, and ion shape.
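
As a rough illustration of how CCS can narrow down candidates that exact mass alone cannot separate, the sketch below filters library entries by a ppm window on precursor m/z and a percent window on CCS. The field names, the tolerance defaults (10 ppm, 2 %), and the example entries are assumptions made for this example, not the MycoMSBase schema or recommended settings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LibraryEntry:
    # Hypothetical field names, chosen for this sketch only
    name: str
    precursor_mz: float
    ccs: float  # collisional cross-section in square angstroms


def within_ppm(observed: float, expected: float, ppm: float) -> bool:
    """True if observed lies within +/- ppm parts per million of expected."""
    return abs(observed - expected) <= expected * ppm / 1e6


def within_percent(observed: float, expected: float, percent: float) -> bool:
    """True if observed lies within +/- percent of expected."""
    return abs(observed - expected) <= expected * percent / 100.0


def filter_candidates(entries: List[LibraryEntry], mz: float, ccs: float,
                      ppm: float = 10.0, ccs_percent: float = 2.0) -> List[LibraryEntry]:
    """Keep entries that agree with the query in both exact mass and CCS."""
    return [
        e for e in entries
        if within_ppm(mz, e.precursor_mz, ppm) and within_percent(ccs, e.ccs, ccs_percent)
    ]


# Made-up entries: A and B are nearly isobaric but differ in ion shape
entries = [
    LibraryEntry("compound A", 301.1410, 170.0),
    LibraryEntry("compound B", 301.1412, 185.0),
    LibraryEntry("compound C", 415.2000, 171.0),
]
hits = filter_candidates(entries, mz=301.1411, ccs=171.5)
hit_names = [e.name for e in hits]
```

Here the mass window alone would keep both A and B; adding the CCS window leaves only compound A.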

Taxonomy integration

Fungal producer metadata is enriched with full NCBI taxonomy — genus, family, order, class, and phylum — fetched automatically via the NCBI Entrez API. The web interface provides:

  • Interactive taxonomy tree — a cladogram showing all fungal producers with biosynthetic class distributions as pie charts at each node, with rank-level zoom (Kingdom / Family / Genus).
  • Taxonomy filters — filter the library by genus or species directly in the search panel.
  • NCBI Taxonomy links — each record links out to the NCBI Taxonomy Browser entry for its producer species.
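
The lineage enrichment described above can be pictured with a small parser. The sketch assumes the XML layout returned by the NCBI E-utilities efetch endpoint for the taxonomy database (a TaxaSet with a LineageEx list); the sample document is a hand-trimmed, illustrative fragment with taxon identifiers omitted, and the lineage values should be checked against NCBI before being relied upon. How MycoMSBase itself calls Entrez is not shown here.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Ranks kept by this sketch, matching the ranks named in the text above
RANKS = ("phylum", "class", "order", "family", "genus")

# Illustrative, hand-trimmed fragment in the style of an efetch taxonomy response
SAMPLE_XML = """
<TaxaSet>
  <Taxon>
    <ScientificName>Hypoxylon rickii</ScientificName>
    <Rank>species</Rank>
    <LineageEx>
      <Taxon><ScientificName>Fungi</ScientificName><Rank>kingdom</Rank></Taxon>
      <Taxon><ScientificName>Ascomycota</ScientificName><Rank>phylum</Rank></Taxon>
      <Taxon><ScientificName>Sordariomycetes</ScientificName><Rank>class</Rank></Taxon>
      <Taxon><ScientificName>Xylariales</ScientificName><Rank>order</Rank></Taxon>
      <Taxon><ScientificName>Hypoxylaceae</ScientificName><Rank>family</Rank></Taxon>
      <Taxon><ScientificName>Hypoxylon</ScientificName><Rank>genus</Rank></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>
"""


def parse_lineage(xml_text: str) -> Dict[str, str]:
    """Map each wanted rank to its scientific name, ignoring other ranks."""
    root = ET.fromstring(xml_text.strip())
    lineage: Dict[str, str] = {}
    for taxon in root.findall("./Taxon/LineageEx/Taxon"):
        rank = taxon.findtext("Rank")
        if rank in RANKS:
            lineage[rank] = taxon.findtext("ScientificName") or ""
    return lineage


lineage = parse_lineage(SAMPLE_XML)
```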

Compound metadata

Every record is annotated with:

  • Biosynthetic class (Polyketide, Terpene, NRPS-like, PKS-NRPS) — drives the charts and taxonomy tree visualisation.
  • Fungal producer with full taxonomic lineage.
  • Literature reference (DOI) linking to the original publication.

Architecture

MycoMSBase is deployed as a set of Docker services orchestrated with Compose:

Browser
  └── nginx (reverse proxy :8080)
        ├── mb3frontend   React/TypeScript web app
        ├── mb3server     Go REST API
        ├── similarity-service   Python spectral search (matchms)
        └── export-service       Java bulk export (MGF/MSP)
              └── postgres   PostgreSQL + Bingo (substructure search)

| Service | Technology | Role |
| --- | --- | --- |
| postgres | PostgreSQL + Bingo | Stores records; enables substructure search |
| mb3server | Go | REST API backend |
| similarity-service | Python / FastAPI / matchms | Cosine spectral similarity search |
| export-service | Java | Bulk MGF/MSP export |
| mb3frontend | React / TypeScript | Web interface |
| nginx | nginx | Routes all services under one port |
| mb3tool | Go | One-shot database initialisation from a data repo |

Prerequisites

  • Docker ≥ 24 with Compose v2
  • Git (for loading data from a repository)
  • ~4 GB RAM for the full stack

Quick start

1. Clone the repository

git clone https://github.com/ECharria/mycomsbase.git
cd mycomsbase

2. Set up the environment

Copy the annotated template and edit the key variables:

cp compose/env.dist compose/.env

Minimum required changes in compose/.env:

# Where PostgreSQL data will be stored on disk
DB_LOCAL_PATH=./../data/postgres-data

# Git repository containing the MassBank .txt record files
MB_GIT_REPO="https://github.com/<your-org>/<your-data-repo>"
MB_GIT_BRANCH=main

# Local path to the record files (used by the similarity and export services)
MB_DATA_DIRECTORY="./../data/mycomsbase-data"

# Hostname or IP of the server (use localhost for local deployment)
MB3_API_HOST=localhost
MB3_FRONTEND_HOST=localhost
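
Before building, it can help to confirm that the variables listed above are set and that the repository placeholder has been replaced. The following is a small sketch of such a check, not a script shipped with the repository; the function names are hypothetical and the parser only handles plain KEY=VALUE lines with optional quotes.

```python
from typing import Dict, List

# Variables named in the "minimum required changes" list above
REQUIRED = (
    "DB_LOCAL_PATH", "MB_GIT_REPO", "MB_GIT_BRANCH",
    "MB_DATA_DIRECTORY", "MB3_API_HOST", "MB3_FRONTEND_HOST",
)


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def problems(values: Dict[str, str]) -> List[str]:
    """Return human-readable issues; an empty list means the file looks usable."""
    issues = [f"{key} is not set" for key in REQUIRED if not values.get(key)]
    repo = values.get("MB_GIT_REPO", "")
    if "<" in repo or ">" in repo:
        issues.append("MB_GIT_REPO still contains a <placeholder>")
    return issues


# The template values from above, unedited
ENV_TEXT = """
# Where PostgreSQL data will be stored on disk
DB_LOCAL_PATH=./../data/postgres-data
MB_GIT_REPO="https://github.com/<your-org>/<your-data-repo>"
MB_GIT_BRANCH=main
MB_DATA_DIRECTORY="./../data/mycomsbase-data"
MB3_API_HOST=localhost
MB3_FRONTEND_HOST=localhost
"""

issues = problems(parse_env(ENV_TEXT))
```

Run against the unedited template, the check reports only the leftover placeholder in MB_GIT_REPO. To check a real file, pass the contents of compose/.env to parse_env instead of ENV_TEXT.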

3. Build and launch

cd compose
docker compose build
docker compose up -d

4. Load the spectral library into the database

docker compose run --rm mb3tool

The mb3tool service clones the data repository and imports all records into PostgreSQL.

Once all services are healthy, open http://localhost:8080/MycoMSBase in your browser.


Configuration reference

All settings live in compose/.env. The fully annotated template is compose/env.dist.

| Variable | Default | Description |
| --- | --- | --- |
| DB_USER / DB_PASSWORD / DB_NAME | mycomsbase / mycomsbasepassword / mycomsbase | PostgreSQL credentials |
| DB_LOCAL_PATH | ./../data/postgres-data | Host path for database storage |
| MB3_API_BASE_URL | /MycoMSBase-api | API base path |
| MB3_FRONTEND_BASE_URL | /MycoMSBase | Frontend base path |
| MB_GIT_REPO | | URL of the MassBank-format data repository |
| MB_DATA_DIRECTORY | ./../data/mycomsbase-data | Local path to record .txt files |
| COSINE_TOLERANCE | 0.05 | Fragment mass tolerance in Da for similarity search |
| SIMILARITY_SERVICE_VERBOSE | false | Verbose logging in similarity service |
| DISTRIBUTOR_TEXT | | Institution name shown on the About page |

Data format

Records follow the MassBank record format. MycoMSBase adds three fields to each record:

PUBLICATION: doi:10.xxxx/xxxxx          ← literature reference
CH$COMPOUND_CLASS: Polyketide           ← biosynthetic class
SP$SCIENTIFIC_NAME: Hypoxylon rickii    ← fungal producer
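
A minimal sketch of reading these fields back out of a record file follows. It assumes the usual MassBank layout of one "TAG: value" pair per line, with indented peak rows and a closing "//"; the sample record, including its accession, is a hypothetical stub rather than a real library entry, and the function names are illustrative.

```python
from typing import Dict, List

SAMPLE_RECORD = """ACCESSION: MSBNK-EXAMPLE-XX000001
PUBLICATION: doi:10.xxxx/xxxxx
CH$COMPOUND_CLASS: Polyketide
SP$SCIENTIFIC_NAME: Hypoxylon rickii
PK$PEAK: m/z int. rel.int.
  100.0000 1000 999
//
"""


def parse_fields(record_text: str) -> Dict[str, List[str]]:
    """Collect 'TAG: value' lines; a tag may occur more than once in a record."""
    fields: Dict[str, List[str]] = {}
    for line in record_text.splitlines():
        if line.startswith(" ") or ": " not in line:
            continue  # indented peak rows and the closing "//" have no tag
        tag, value = line.split(": ", 1)
        fields.setdefault(tag, []).append(value.strip())
    return fields


def annotations(fields: Dict[str, List[str]]) -> Dict[str, str]:
    """Extract the three annotation fields, using '' when one is missing."""
    wanted = ("PUBLICATION", "CH$COMPOUND_CLASS", "SP$SCIENTIFIC_NAME")
    return {tag: (fields.get(tag) or [""])[0] for tag in wanted}


record_annotations = annotations(parse_fields(SAMPLE_RECORD))
```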

A reference table of all compounds (mycomsbase_unique_compounds.csv) is included at the repository root with the columns:

inchikey · compound_name · compound_class · fungal_producer · n_spectra · doi · example_accession
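
The reference table lends itself to quick summaries. The sketch below totals spectra per biosynthetic class, assuming the file is comma-delimited with a header row carrying the column names above; the rows shown are placeholders invented for the example, not entries from the library.

```python
import csv
import io
from collections import Counter
from typing import TextIO

COLUMNS = [
    "inchikey", "compound_name", "compound_class", "fungal_producer",
    "n_spectra", "doi", "example_accession",
]

# Placeholder rows for illustration only
SAMPLE_CSV = """inchikey,compound_name,compound_class,fungal_producer,n_spectra,doi,example_accession
PLACEHOLDER-KEY-1,compound A,Polyketide,Hypoxylon rickii,3,10.xxxx/xxxxx,EXAMPLE-0001
PLACEHOLDER-KEY-2,compound B,Terpene,Hypoxylon rickii,2,10.xxxx/xxxxx,EXAMPLE-0002
PLACEHOLDER-KEY-3,compound C,Polyketide,Hypoxylon rickii,1,10.xxxx/xxxxx,EXAMPLE-0003
"""


def spectra_per_class(handle: TextIO) -> Counter:
    """Sum n_spectra per compound_class after checking the expected columns exist."""
    reader = csv.DictReader(handle)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    totals: Counter = Counter()
    for row in reader:
        totals[row["compound_class"]] += int(row["n_spectra"])
    return totals


totals = spectra_per_class(io.StringIO(SAMPLE_CSV))
```

To run it on the real table, open mycomsbase_unique_compounds.csv with newline="" and pass the file handle in place of the StringIO object.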


Development

Rebuild a single service

docker compose build mb3server           # Go backend
docker compose build mb3frontend         # React frontend
docker compose build similarity-service  # Python similarity service
docker compose up -d <service>

API

The REST API is served at http://localhost:8080/MycoMSBase-api/.
The full OpenAPI specification is at config-openapi.yaml.

Similarity service endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /similarity | POST | Cosine similarity search against the library |
| /export/mgf | POST | Export selected records as MGF |
| /version | GET | Service version and loaded library size |
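
As a rough sketch of how a client might assemble a call to the /similarity endpoint, the example below builds (but does not send) a JSON POST request. The request schema is not documented in this README, so the payload keys peak_list and threshold, as well as the base URL, are assumptions made purely for illustration; the real schema should be taken from config-openapi.yaml or the service source.

```python
import json
from typing import Sequence, Tuple
from urllib.request import Request

# Assumption: the address at which the similarity service is reachable
BASE_URL = "http://localhost:8080"


def build_similarity_request(peaks: Sequence[Tuple[float, float]], threshold: float,
                             base_url: str = BASE_URL) -> Request:
    """Build a POST request for /similarity with a hypothetical JSON body."""
    payload = {
        "peak_list": [[mz, intensity] for mz, intensity in peaks],  # hypothetical key
        "threshold": threshold,                                      # hypothetical key
    }
    return Request(
        url=base_url.rstrip("/") + "/similarity",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_similarity_request([(100.0, 1.0), (150.02, 0.5)], threshold=0.7)
body = json.loads(request.data)
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the example can be run without a live deployment.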

Contributing

MycoMSBase is under active development and we warmly welcome contributions of all kinds — bug reports, feature ideas, or code improvements.

  • Open a pull request — for code changes or new records, fork the repo and submit a PR
  • Open an issue — for bug reports or feature requests
  • Send us your samples — if you have fungal samples or extracts that could expand the library, we would love to hear from you
  • Get in touch — for questions, sample contributions, or collaboration: esteban.charriagiron@wur.nl

Citing

If you use MycoMSBase, please cite the underlying MassBank3 infrastructure:

Neumann S. et al. MassBank3: the spectral reference library's next generation software product.
DOI: 10.5281/zenodo.16923315


License

Distributed under the GPL-3.0 license — see LICENSE for details.
