ComputationalAgronomy/PDB_Convex_Hull

PDB_Convex_Hull

Sequence-based methods miss structurally conserved regions with low sequence conservation, so this pipeline applies the convex hull algorithm to identify conserved local structural topologies directly from 3D protein structures. It builds polyhedrons from 30-residue sliding window segments and compares them using Shared Triangles (ST) and Adjacent Shared Triangles (AST) metrics to detect potential functional domains that sequence-based approaches such as PROSITE cannot capture.
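
The windowing and hull-building idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline's code: the function names are hypothetical, a step size of 1 is assumed, the hull is found by brute force rather than with a library routine, and "shared triangles" is assumed here to mean hull faces with identical window-relative vertex indices (this README does not define ST or AST).

```python
from itertools import combinations

WINDOW = 30  # segment length stated in the description above


def sliding_windows(coords, size=WINDOW, step=1):
    """Yield (start_index, segment) for every full window (step=1 is an assumption)."""
    for start in range(0, len(coords) - size + 1, step):
        yield start, coords[start:start + size]


def _side(a, b, c, d):
    """Signed value whose sign tells which side of plane abc the point d lies on."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    ad = [d[i] - a[i] for i in range(3)]
    normal = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return sum(normal[i] * ad[i] for i in range(3))


def hull_triangles(points, eps=1e-9):
    """Brute-force hull faces: a triple is a face if all other points lie on one side.

    Assumes points in general position (no four coplanar hull points).
    """
    faces = set()
    for i, j, k in combinations(range(len(points)), 3):
        sides = set()
        for m, p in enumerate(points):
            if m in (i, j, k):
                continue
            value = _side(points[i], points[j], points[k], p)
            if abs(value) > eps:
                sides.add(value > 0)
        if len(sides) == 1:
            faces.add((i, j, k))
    return faces


def shared_triangle_count(faces_a, faces_b):
    """Assumed reading of ST: faces with the same window-relative indices."""
    return len(faces_a & faces_b)
```

For a tetrahedron with one interior point, `hull_triangles` returns the four outer faces and ignores the interior point. The brute-force search is quartic in the number of points, which is tolerable for 30-point windows but not for whole structures.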

Built by Guan-Yu Chen and Yuh-Chyang Charng.

Installation


Install all required dependencies with:

pip install -r requirements.txt

Usage


  • Quick guide:

    Use your own PDB files according to the following workflow:

    01_extract_residues → 02_convexhull_30aa → 03_compare_convexhull_30aa → 04_summary → 05_rearrange_freq_match

  • Workspace:

    All scripts accept -w/--workspace to set the root working directory. It defaults to the script's own location. All other paths are resolved relative to it, so in most cases -w is the only argument you need to set.

    Assume <workspace>/pdb_files/ contains two PDB files: proteinA.pdb and proteinB.pdb.
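
The workspace behavior described above could be implemented along these lines. This is a minimal sketch, not the scripts' actual code: only `-w/--workspace`, `-p`, and `-o` appear in this README, so the long option names `--pdb-dir` and `--output`, the default values, and the helper names are assumptions.

```python
import argparse
from pathlib import Path


def build_parser(script_dir):
    """Build a parser whose workspace defaults to the script's own directory."""
    parser = argparse.ArgumentParser(description="Hypothetical pipeline step")
    parser.add_argument(
        "-w", "--workspace", type=Path, default=Path(script_dir),
        help="root working directory (defaults to the script's location)",
    )
    # Long names and defaults below are assumptions for illustration.
    parser.add_argument("-p", "--pdb-dir", default="pdb_files")
    parser.add_argument("-o", "--output", default="info")
    return parser


def resolve(workspace, relative):
    """Resolve a path relative to the workspace; absolute paths pass through unchanged."""
    return Path(workspace) / relative
```

In a real script, `script_dir` would typically be `Path(__file__).resolve().parent`, so that running without `-w` uses the script's own folder.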

  • Workflow:

  1. 01_extract_residues:

    Place your PDB files in <workspace>/pdb_files/ and run the script. It writes one info Excel file per PDB file to <workspace>/info/.

    python 01_extract_residues.py -w /path/to/workspace
    # or override individual paths:
    python 01_extract_residues.py -w /path/to/workspace -p pdb_files -o info
  2. 02_convexhull_30aa:

    Run once per PDB file. The script reads the info Excel file from step 1 and writes its results to <workspace>/info/<pdb_name>/.

    python 02_convexhull_30aa.py -w /path/to/workspace -i info/proteinA_30aa_info.xlsx
    python 02_convexhull_30aa.py -w /path/to/workspace -i info/proteinB_30aa_info.xlsx
    # or override individual paths:
    python 02_convexhull_30aa.py -w /path/to/workspace -i info/proteinA_30aa_info.xlsx -p pdb_files -o info

    This produces info/proteinA/ and info/proteinB/.

  3. 03_compare_convexhull_30aa:

    Point to the two subfolders generated in step 2 and specify an output folder.

    python 03_compare_convexhull_30aa.py -w /path/to/workspace -1 info/proteinA -2 info/proteinB -o output
  4. 04_summary:

    python 04_summary.py -w /path/to/workspace -i output -o output/summary.xlsx
  5. 05_rearrange_freq_match:

    python 05_rearrange_freq_match.py -w /path/to/workspace -i output/summary.xlsx -o output
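
The five steps above can also be chained from a small driver. The sketch below only assembles the same command lines shown in the workflow; `pipeline_commands` and `run_pipeline` are hypothetical helpers, not part of the repository, and they assume the scripts sit in the current directory and that step 1 names its output `<pdb_name>_30aa_info.xlsx`, as in the step 2 examples.

```python
import subprocess
import sys


def pipeline_commands(workspace, pdb_a, pdb_b, out="output"):
    """Return the command lines for steps 1 to 5 for two PDB names (without .pdb)."""
    py = sys.executable
    w = ["-w", str(workspace)]
    summary = f"{out}/summary.xlsx"
    cmds = [[py, "01_extract_residues.py", *w]]
    # Step 2 runs once per PDB file.
    for name in (pdb_a, pdb_b):
        cmds.append([py, "02_convexhull_30aa.py", *w,
                     "-i", f"info/{name}_30aa_info.xlsx"])
    cmds.append([py, "03_compare_convexhull_30aa.py", *w,
                 "-1", f"info/{pdb_a}", "-2", f"info/{pdb_b}", "-o", out])
    cmds.append([py, "04_summary.py", *w, "-i", out, "-o", summary])
    cmds.append([py, "05_rearrange_freq_match.py", *w, "-i", summary, "-o", out])
    return cmds


def run_pipeline(workspace, pdb_a, pdb_b, out="output"):
    """Run each step in order and stop at the first failing step."""
    for cmd in pipeline_commands(workspace, pdb_a, pdb_b, out):
        subprocess.run(cmd, check=True)
```

For the example workspace, `run_pipeline("/path/to/workspace", "proteinA", "proteinB")` would issue six commands, because step 2 runs twice.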

Descriptions of what each step does will be added later.
