Cosolvent Analysis Toolkit (CAT)
Authors: Frances Sabanes Zariquiey, Joao V. de Souza, Agnieszka K. Bronowska

A. General description and purpose:
CAT (Cosolvent Analysis Toolkit) is a platform for analysing mixed-cosolvent molecular dynamics simulations,
developed by the Computational Medicinal Chemistry Group at Newcastle University, UK (2018).

B. The workflow: 
CAT reads PDB frames from a trajectory generated with GROMACS (or other MD simulation software),
together with the GROMACS topology (topol.top) and the cosolvent molecule itp file.
Using these inputs, CAT clusters and ranks the regions with the highest
probability of being an allosteric binding site.

C. Dependencies:
CAT is fully standalone code, so there are no special dependencies, and you can use any C++ compiler that suits you.
We used the GNU C++ compiler (gcc), which is set automatically in the default Makefile.
We recommend installing GROMACS and acpype (both packages are free of charge).
There are no special hardware requirements.

D. Compilation:
The compilation of this code was tested on Ubuntu 14.04 and Ubuntu 16.04 operating systems.

To compile CAT, run:
cd CAT_GIT
make

This will generate the CAT executable.
The compilation should take less than one minute on a standard Linux workstation.

E. Usage:
1 - Prepare your trajectory: remove PBC artefacts, strip waters, and save the trajectory in PDB format.

The instructions below use standard GROMACS tools. You may choose a different software package, e.g. Amber or CHARMM.

1.1 - Remove any potential PBC artefacts (here with the -pbc mol option) and strip waters
gmx trjconv -s protein.tpr -f protein.xtc -o protein.gro -center -pbc mol

When prompted for the output group, select the protein and the cosolvent molecules (as a single group); leaving the waters out of this group is what strips them.

1.2 - Fit your trajectory to remove rotational and translational degrees of freedom, saving all atoms, including hydrogens 

gmx trjconv -s protein.tpr -f protein.xtc -o protein-fit.xtc -fit rot+trans

1.3 - Save the trajectory in PDB format (multiple frames) in a new folder.

gmx trjconv -f protein-fit.xtc -s protein.tpr -o snapsCAT/protein.pdb -sep

2 - Build the list of all frames you want to analyse

for i in snapsCAT/protein*.pdb ; do echo "$i" >> list.dat ; done
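
The loop above appends to list.dat, so running it twice duplicates entries, and the shell glob sorts names lexicographically (protein10.pdb comes before protein2.pdb). Whether CAT is sensitive to frame order is not stated here. The following is a minimal sketch, assuming GNU `sort -V` is available, of a helper that rewrites the list from scratch in numeric frame order and prints the frame count needed later for `-n`. The function name `make_frame_list` is hypothetical and not part of CAT.

```shell
# Hypothetical helper, not part of CAT: rebuild the frame list from scratch.
# Usage: make_frame_list <frames_dir> <list_file>
make_frame_list() {
    local frames_dir="$1" list_file="$2"

    # find avoids "argument list too long" for very long trajectories;
    # sort -V (GNU version sort) orders protein2.pdb before protein10.pdb.
    find "$frames_dir" -maxdepth 1 -name 'protein*.pdb' | sort -V > "$list_file"

    # Print the number of listed frames (the value to pass to CAT via -n).
    wc -l < "$list_file" | tr -d ' '
}
```

For example, `n=$(make_frame_list snapsCAT list.dat)` writes list.dat and stores the frame count in `n`.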

3 - run CAT

./CAT -i list.dat -o output.dat -c <radius_of_selection_sphere_around_the_residue> -n <number_of_files_in_list.dat> -t topol.top -e <electrostatic_delta_for_softcore_potential> -s <comolecule>.itp -v <vdw_delta_for_softcore_potential> -g <residues_clustering>
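
Because `-n` must match the number of files in list.dat, it can help to check the inputs before launching CAT. The sketch below is a hypothetical pre-flight helper (`check_cat_inputs` is an invented name, not a CAT feature): it verifies that the list, topology and itp files exist, that every listed frame is present, and prints the frame count on success.

```shell
# Hypothetical pre-flight check, not part of CAT itself.
# Usage: check_cat_inputs <list_file> <topology> <cosolvent_itp>
check_cat_inputs() {
    local list_file="$1" topology="$2" cosolvent_itp="$3"
    local f missing=0 n_frames=0

    # The three files passed to -i, -t and -s must exist.
    for f in "$list_file" "$topology" "$cosolvent_itp"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done

    # Every non-empty line of the list must name an existing frame file.
    while IFS= read -r f || [ -n "$f" ]; do
        [ -z "$f" ] && continue
        if [ -f "$f" ]; then
            n_frames=$((n_frames + 1))
        else
            echo "listed frame not found: $f" >&2
            missing=$((missing + 1))
        fi
    done < "$list_file"

    [ "$missing" -eq 0 ] || return 1
    # Frame count, suitable as the value for -n.
    echo "$n_frames"
}
```

A possible use is `n=$(check_cat_inputs list.dat topol.top <comolecule>.itp) || exit 1`, then passing `"$n"` as the `-n` value.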

F. Performance:
The time CAT needs to process the snapshots depends directly on the size of your trajectory, i.e. the number of snapshots/frames. On a standard Linux workstation, it typically takes 2 minutes for a 2000-frame trajectory.
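
As a rough planning aid, the figure above can be extrapolated to other trajectory lengths. The sketch below assumes that run time scales linearly with the number of frames, which is an assumption and not a measured property of CAT, and uses only the single data point quoted above (120 seconds for 2000 frames). The function name `estimate_cat_seconds` is hypothetical.

```shell
# Hypothetical estimate, ASSUMING linear scaling from the one quoted data
# point (2 minutes = 120 s for 2000 frames). Real timings depend on the
# system size, the hardware and the chosen options.
# Usage: estimate_cat_seconds <number_of_frames>
estimate_cat_seconds() {
    local n_frames="$1"
    # Integer arithmetic: seconds = frames * 120 / 2000
    echo $(( n_frames * 120 / 2000 ))
}
```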

G. Syntax and options
A detailed guideline is provided in the CAT_tutorial, which gives in-depth instructions and a step-by-step explanation of CAT.

Input options
-i : input list of the structure files generated in step E.1.3 (the list.dat built in step E.2)
-o : output table file with all the scores, depth and data per residue
-n : number of files in the list passed to -i
-t : protein topology generated with pdb2gmx (GROMACS)
-s : cosolvent molecule itp (usually generated by acpype)
-e : delta for smoothing the electrostatic softcore potential (0 gives a hardcore potential)
-v : delta for smoothing the VDW softcore potential (0 gives a hardcore potential)
-c : the sphere radius used to select interacting cosolvent molecules around each residue
-g : clustering radius of the per-residue sphere

CAT will generate the following output files:

all_clusters_top_spheres.pdb - the position of the clusters for the top-10 molecules
all_contact.pdb - all the sphere regions surrounding each residue, with the retention score
all_spheres.pdb - all spheres per residue
aver_withB.pdb - average structure from the trajectory
all_clusters.pdb - all possible clusters
output.dat - the table with all output values:
	ele = electrostatic score
	e_VAR = variance of the electrostatic score
	vdw = Van der Waals score
	v_var = Van der Waals variance
	count = average number of molecules inside that residue sphere
	c_var = rmsd of the number of molecules inside that residue sphere
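
To inspect output.dat quickly, the rows can be sorted by one of the columns listed above. The exact layout of output.dat is not described here, so the sketch below rests on a loud assumption: the file is whitespace-separated and its first line is a header whose fields are the column names (for example `count`). If your file differs, adjust accordingly. The function name `rank_by_column` is hypothetical, and sorting "largest first" is only a viewing convenience, not a statement about how CAT ranks sites.

```shell
# Hypothetical helper, not part of CAT.
# ASSUMPTION: <table> is whitespace-separated and its first line is a header
# naming the columns (e.g. "ele", "vdw", "count"). Check your file first.
# Usage: rank_by_column <table> <column_name> [top_n]
rank_by_column() {
    local table="$1" column="$2" top_n="${3:-10}"
    local tab
    tab=$(printf '\t')

    awk -v col="$column" '
        NR == 1 {
            # Locate the requested column in the header line.
            for (i = 1; i <= NF; i++) if ($i == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        # Prefix each data row with its sort key, separated by a tab.
        { print $idx "\t" $0 }
    ' "$table" |
        sort -t "$tab" -k1,1 -g -r |   # largest values first
        head -n "$top_n" |
        cut -f2-                       # drop the temporary sort key
}
```

For example, `rank_by_column output.dat count 5` would print the five rows with the largest `count` values under that assumed layout.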

H. Instructions for data reproduction
Sample input files and trajectory files are included in the CAT_tutorial directory.
