A mixture of scripts and libraries to help with sequence data manipulation, tree parsing, and other things.
Author
Gregg Thomas
About
These scripts can be used for many tasks including sequence handling, tree making, and sequence alignment.
Some of these programs are mainly used as wrappers to easily run other genomics or phylogenetics programs on a bunch of files. Pay attention to the dependencies for each script to make sure you have the proper programs installed.
Please note that many of these scripts expect input as FASTA files. For my scripts, these must have the extension .fa. If you don't have FASTA formatted files, you can use seq_convert to get them to FASTA format and fa_edit to make any changes you need to them afterwards.
Lots of functions for Newick trees. Used by a lot of my scripts.
corelib/wrapperlib.py
All the modules to run the other programs in wrappers.py
dev/
Some scripts that I use, but only occasionally.
legacy-scripts/
The old versions of the FASTA handling scripts (before fasta_nd_furious) and of the wrappers for other programs (before wrappers.py).
fasta_nd_furious.py
A general purpose FASTA handling script. Can count positions in sequences and alignments, concatenate alignments, combine and split .fa files, trim and relabel headers, remove sequences and start positions, and replace specific states in sequences.
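As a rough illustration of what concatenating alignments involves, here is a minimal sketch. It is not the code in fasta_nd_furious.py: the function name `concatenate` and the dict-of-strings representation are hypothetical, and padding a missing sequence with gap characters is an assumption about how absent taxa might be handled.

```python
def concatenate(alignments):
    """Join a list of alignments end to end.

    Each alignment is a dict mapping a sequence header to its aligned sequence
    (hypothetical representation). A header missing from one alignment is
    padded with '-' for that alignment's length (an assumption).
    """
    # Collect headers in first-seen order.
    headers = []
    for aln in alignments:
        for header in aln:
            if header not in headers:
                headers.append(header)

    combined = {header: "" for header in headers}
    for aln in alignments:
        if not aln:
            continue  # nothing to add from an empty alignment
        length = len(next(iter(aln.values())))
        for header in headers:
            combined[header] += aln.get(header, "-" * length)
    return combined


if __name__ == "__main__":
    gene1 = {"human": "ACGT", "mouse": "ACGA"}
    gene2 = {"human": "TT"}
    print(concatenate([gene1, gene2]))
```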
how_many_trees
Just a little script to show the number of possible rooted tree topologies for a given number of species.
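For reference, the count can be sketched as follows, assuming the script counts rooted, bifurcating, labeled topologies. In that case the number for n species is the double factorial (2n - 3)!!. The function name `rooted_topologies` is hypothetical and this is not necessarily how how_many_trees computes it.

```python
def rooted_topologies(n_species):
    """Number of rooted, bifurcating, labeled tree topologies: (2n - 3)!!"""
    if n_species < 1:
        raise ValueError("need at least one species")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n <= 2).
    for k in range(3, 2 * n_species - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10):
        print(n, rooted_topologies(n))
```

The count grows very quickly, which is why exhaustive tree searches are only feasible for a handful of species.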
isofilter.py
A script for filtering all but the longest isoform from a set of proteins. Works with Ensembl and NCBI data.
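The core idea of keeping only the longest isoform per gene can be sketched like this. The function `longest_isoforms` and its `(gene_id, protein_id, sequence)` input tuples are hypothetical; how isofilter.py actually maps Ensembl or NCBI proteins to genes is not shown here.

```python
def longest_isoforms(records):
    """Keep the longest protein for each gene.

    records: iterable of (gene_id, protein_id, sequence) tuples
             (hypothetical input format).
    Returns a dict mapping gene_id -> (protein_id, sequence).
    On ties, the first isoform seen is kept.
    """
    best = {}
    for gene_id, protein_id, seq in records:
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (protein_id, seq)
    return best


if __name__ == "__main__":
    records = [
        ("geneA", "protA1", "MKT"),
        ("geneA", "protA2", "MKTLLV"),
        ("geneB", "protB1", "MA"),
    ]
    print(longest_isoforms(records))
```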
paml_lrt.py
After using wrappers.py to run the branch-site test with codeml, this script does the likelihood ratio tests for all genes and reports those that pass.
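A likelihood ratio test of this kind compares the log-likelihoods of the null and alternative models. The sketch below assumes the test statistic 2(lnL_alt - lnL_null) is compared against a chi-square distribution with one degree of freedom, which is a common choice for the branch-site test; the distribution and cutoff paml_lrt.py actually uses are not stated here, and the function name `lrt` is hypothetical.

```python
import math


def lrt(lnl_null, lnl_alt):
    """Return (test statistic, p-value) for a likelihood ratio test.

    Assumes a chi-square distribution with 1 degree of freedom, for which
    the survival function is erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # Optimization noise can make the statistic slightly negative.
    stat = max(stat, 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    stat, p = lrt(lnl_null=-1001.92, lnl_alt=-1000.0)
    print("2*dlnL = {:.2f}, p = {:.4f}".format(stat, p))
```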
seq_convert.py
Can convert sequences between FASTA (.fa), Phylip (.ph), and Nexus (.nex) formats. Note that these formats often vary in small ways between users, so this might not work right away for you. Consider this in Beta.
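To show the kind of conversion involved, here is a minimal FASTA to sequential Phylip sketch. It is not the code in seq_convert.py: the function names `read_fasta` and `to_phylip` are hypothetical, and the exact Phylip layout (header line, spacing, label length) is an assumption, since the format varies between programs.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of header -> sequence (insertion ordered)."""
    seqs = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            seqs[header] = ""
        elif header is not None:
            seqs[header] += line
    return seqs


def to_phylip(seqs):
    """Write an alignment as sequential Phylip: 'ntaxa nsites', then one line per taxon."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned (all the same length)")
    lines = ["{} {}".format(len(seqs), lengths.pop())]
    for header, seq in seqs.items():
        lines.append("{}  {}".format(header, seq))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    fasta = ">human\nACGT\nAC\n>mouse\nACGAAC\n"
    print(to_phylip(read_fasta(fasta)), end="")
```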
tree.py
Some general purpose Newick tree handling modules. Can join or separate directories or files of trees, label internal nodes of trees, check if trees are rooted, root trees, calculate concordance factors, and count and relabel tips.
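Two of these tasks, counting tips and checking whether a tree is rooted, can be sketched directly on a Newick string. This is only an illustration with hypothetical function names, not the code in tree.py. It assumes unquoted labels (no commas or parentheses inside names) and treats a tree as rooted when its outermost node has exactly two children.

```python
def count_tips(newick):
    """Number of tips: every comma separates two sibling subtrees."""
    return newick.count(",") + 1


def root_children(newick):
    """Count the children of the outermost node (commas at depth 1)."""
    depth = 0
    children = 1
    for ch in newick.strip().rstrip(";"):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            children += 1
    return children


def is_rooted(newick):
    """A bifurcation at the base is taken to mean the tree is rooted."""
    return root_children(newick) == 2


if __name__ == "__main__":
    for tree in ("((a,b),c);", "(a,b,c);", "((a:0.1,b:0.2):0.3,(c,d));"):
        print(tree, count_tips(tree), is_rooted(tree))
```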
wrappers.py
This script is big. It is meant as a wrapper for all of the other evolutionary analysis programs I use, including MUSCLE, PASTA, GBlocks, RAxML, codeml, SDM, r8s, and Notung. Many of these programs are meant to work on a single file, require an input control file, or need the data rearranged between steps. This script hopefully handles that for all of these programs (for the analyses I usually do). Obviously, to run one of the modules you will need that program installed!
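The general pattern of running a single-file program over many .fa files can be sketched as below. This is not how wrappers.py is implemented. The helper names `build_commands` and `run_all` are hypothetical, and `some_aligner` with its arguments is a placeholder, not the command line of any real program; check the documentation of the installed program for its actual flags.

```python
import os
import subprocess


def build_commands(paths, out_dir, template, ext=".fa"):
    """Build one command per input file.

    template is a list of argument strings in which '{infile}' and
    '{outfile}' are filled in for each file. Files without the expected
    extension are skipped.
    """
    commands = []
    for path in sorted(paths):
        if not path.endswith(ext):
            continue
        outfile = os.path.join(out_dir, os.path.basename(path))
        commands.append([part.format(infile=path, outfile=outfile) for part in template])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 'some_aligner' is a placeholder program name, so the commands are only printed here.
    template = ["some_aligner", "{infile}", "{outfile}"]
    for cmd in build_commands(["genes/g2.fa", "genes/g1.fa", "genes/notes.txt"], "aligned", template):
        print(" ".join(cmd))
```

Passing each command as a list of arguments instead of a single shell string avoids quoting problems with unusual file names.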