This is the pipeline for the assembly of the Pyramimonas orientalis virus genome. The raw data are available from NCBI BioProject PRJNA641252. See the paper for details.
The pipeline depends on the following tools, R packages and Perl modules:
- R, perl
- cutadapt
- pilon (the path $PILON_path should be specified in params.cfg)
- prokka
- NCBI blast+
- bowtie2
- spades
- seqkit
- samtools
- dplyr
- tidyr
- bioformatr
- Biostrings
- IRanges
- mmgenome2
- Bio::SeqIO
- Bio::Tools::GFF
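Before running the pipeline it can help to confirm that the command-line tools listed above are on the PATH. The following is a minimal sketch, not part of the pipeline itself: `missing_tools` is a hypothetical helper, and the executable names passed to it in the usage note are assumptions about how the tools are installed.

```shell
# Hypothetical helper (not part of the pipeline): print each command name
# given as an argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

For example, `missing_tools Rscript perl cutadapt prokka blastn bowtie2 seqkit samtools` prints one line per tool that is not found and nothing if all are present. R packages and Perl modules are not covered by this check.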
The blast+ database built from SILVA_*_SSURef_tax_silva.fasta should be available at the path $SILVA_path specified in params.cfg.
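Both $PILON_path and $SILVA_path have to be defined in params.cfg. The sketch below assumes that params.cfg is a shell fragment of `NAME=value` lines that can be sourced; this format is an assumption, so check the file shipped with the pipeline. `unset_params` is a hypothetical helper that reports which of the named variables are still unset or empty.

```shell
# Hypothetical helper (not part of the pipeline): print the name of every
# listed variable that is unset or empty in the current shell.
unset_params() {
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Assumed usage, with params.cfg containing lines such as
#   PILON_path=/path/to/pilon      (placeholder path)
#   SILVA_path=/path/to/silva_db   (placeholder path)
#
# . ./params.cfg
# unset_params PILON_path SILVA_path
```

If the call prints nothing, both variables are set; it does not verify that the paths exist.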
The script assemble.sh assembles the data. Whenever the input data change, the files filter_scaffolds_1.txt and filter_scaffolds_2.txt should be updated using the interactive mode of mmgenome2 (refer to filter_scaffolds.R for details).
The script finalize.sh produces the final output files. It relies on a user-supplied list of selected scaffolds, spades_scaffolds_selected.txt.