A pipeline for evaluating the performance of unsupervised, clustering-based species delimitation on simulated eDNA metabarcoding datasets.
- Download from source:

      git clone https://github.com/ComputationalAgronomy/seq-cluster-eval.git
      cd seq-cluster-eval

- Install dependencies:

      pip install -r requirements.txt

- Install package locally:

      pip install -e .

- Verify installation:

      pytest  # or pytest tests/test_XXX.py

| Step | Module | Output | Options |
|---|---|---|---|
| 1 | TreeSimulator | .nwk file | |
| 2 | SeqSimulator | .fasta file | |
| 3 | SeqEncoder | matrix | distance or feature |
| 4 | External Packages | embedding | PCA / MDS / UMAP |
| 5 | External Packages | labels | k-means / HDBSCAN |
| 6 | Metrics | results DataFrame | |
See example.
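
Independently of the repository's own example, steps 3 to 6 of the table can be sketched with NumPy and scikit-learn alone. This is a minimal illustration under several assumptions, not the package's API: the helper names `simulate_toy_sequences` and `one_hot_encode` are hypothetical, the toy simulator replaces TreeSimulator and SeqSimulator (no tree, no IQ-TREE AliSim), and the adjusted Rand index is only one plausible choice of metric. It also assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

ALPHABET = "ACGT"


def simulate_toy_sequences(n_species=3, n_per_species=10, length=60,
                           mut_rate=0.05, seed=0):
    """Hypothetical stand-in for steps 1-2: random ancestors plus point mutations."""
    rng = np.random.default_rng(seed)
    seqs, labels = [], []
    for species in range(n_species):
        ancestor = rng.integers(0, 4, size=length)
        for _ in range(n_per_species):
            seq = ancestor.copy()
            # Redraw the base at a random subset of sites.
            mutate = rng.random(length) < mut_rate
            seq[mutate] = rng.integers(0, 4, size=int(mutate.sum()))
            seqs.append("".join(ALPHABET[i] for i in seq))
            labels.append(species)
    return seqs, np.array(labels)


def one_hot_encode(seqs):
    """Step 3 ("feature" option): one row per sequence, four columns per site."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    matrix = np.zeros((len(seqs), len(seqs[0]) * 4))
    for row, seq in enumerate(seqs):
        for pos, base in enumerate(seq):
            matrix[row, pos * 4 + index[base]] = 1.0
    return matrix


seqs, true_labels = simulate_toy_sequences()

# Step 3: encode aligned sequences as a feature matrix.
features = one_hot_encode(seqs)

# Step 4: reduce the feature matrix to a 2-D embedding with PCA.
embedding = PCA(n_components=2, random_state=0).fit_transform(features)

# Step 5: cluster the embedding; k-means needs the cluster count up front.
pred_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embedding)

# Step 6: compare predicted clusters with the simulated species labels.
ari = adjusted_rand_score(true_labels, pred_labels)
print(f"Adjusted Rand index: {ari:.3f}")
```

Swapping PCA for MDS or UMAP, or k-means for HDBSCAN, changes only the two lines for steps 4 and 5. Note that k-means requires the number of clusters in advance, whereas HDBSCAN infers it from the data.
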
Python packages: See requirements.txt
External software - IQ-TREE: http://www.iqtree.org/doc/AliSim