PocketEngineer 🧬

PocketEngineer is a command-line tool for structure-guided directed evolution. Given a PDB structure and a ligand, it identifies the binding pocket, classifies residue interactions, estimates flexibility, and ranks mutation candidates for either activity (tighter binding / better catalysis) or flux (higher substrate throughput / kcat).


Installation

git clone https://github.com/anmol-adhav/pocketengineer.git
cd pocketengineer
conda create -n pocketengineer python=3.10 -y
conda activate pocketengineer
pip install -r requirements.txt
# Optional: pocket detection without a ligand
conda install -c conda-forge fpocket -y

Quick Start

# From RCSB PDB ID + ligand code
bash pe.sh --pdb 1RG7 --ligand MTX --goal activity

# Specify chain explicitly (recommended for homodimers)
bash pe.sh --pdb 1HTB --ligand NAD --goal activity --chain A

# Optimise for throughput / kcat
bash pe.sh --pdb 1ERQ --ligand BJH --goal flux --chain A

# From UniProt ID (fetches AlphaFold model)
bash pe.sh --uniprot P00533 --ligand -- --goal activity

# From local PDB file with docked ligand
bash pe.sh --local my_protein_docked.pdb --ligand LIG --goal activity --chain A

# Skip evolutionary conservation scoring
bash pe.sh --pdb 3NYD --ligand 3NY --goal activity --no-conservation
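
To run several structures in one go, the commands above can be assembled programmatically. The following is a minimal sketch, assuming `pe.sh` sits in the current working directory and accepts the flags exactly as shown above. `build_command` and `run_batch` are hypothetical helper names and are not part of PocketEngineer.

```python
import subprocess
from typing import List, Optional


def build_command(pdb: str, ligand: str, goal: str = "activity",
                  chain: Optional[str] = None,
                  conservation: bool = True) -> List[str]:
    """Assemble a pe.sh invocation from the flags documented in this README."""
    if goal not in ("activity", "flux"):
        raise ValueError(f"goal must be 'activity' or 'flux', got {goal!r}")
    cmd = ["bash", "pe.sh", "--pdb", pdb, "--ligand", ligand, "--goal", goal]
    if chain is not None:
        cmd += ["--chain", chain]
    if not conservation:
        cmd.append("--no-conservation")
    return cmd


def run_batch(jobs: List[dict]) -> None:
    """Run each job in turn; check=True stops at the first failing run."""
    for job in jobs:
        subprocess.run(build_command(**job), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for job in [
        {"pdb": "1RG7", "ligand": "MTX"},
        {"pdb": "1ERQ", "ligand": "BJH", "goal": "flux", "chain": "A"},
    ]:
        print(" ".join(build_command(**job)))
```

Passing the command as a list rather than a single string avoids shell quoting issues when a path or ligand code contains unusual characters.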

Arguments

| Flag | Description | Default |
| --- | --- | --- |
| --pdb PDB_ID | 4-character RCSB PDB code | |
| --uniprot ID | UniProt accession (fetches AlphaFold) | |
| --local FILE | Path to local .pdb / .cif file | |
| --ligand CODE | 3-character HETATM residue name | required |
| --chain CHAIN | Chain ID to analyse (e.g. A) | all chains, deduplicated |
| --goal | activity or flux | activity |
| --cutoff Å | Distance cutoff for pocket detection | 4.5 |
| --no-conservation | Skip MSA / conservation scoring | off |
| --output DIR | Output directory | outputs/ |

Output

For each run, PocketEngineer writes the following files to the output directory (outputs/ by default):

| File | Description |
| --- | --- |
| {PDB}_pocket_diagram.png | 2D ligand interaction diagram |
| {PDB}_mutations.csv | Full ranked mutation table |
| {PDB}_summary.txt | Top 3 candidates with strategies |
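
The ranked table can be inspected with the Python standard library. This is a minimal sketch, assuming the CSV has a header row and that its rows are already in rank order. The column names are not documented here, so the helper (hypothetical name `top_rows`) does not assume any.

```python
import csv
from itertools import islice
from typing import Dict, List


def top_rows(csv_path: str, n: int = 3) -> List[Dict[str, str]]:
    """Return the first n data rows of a mutations CSV as dictionaries.

    Assumes a header row and that rows are already sorted by rank.
    """
    with open(csv_path, newline="") as handle:
        return list(islice(csv.DictReader(handle), n))
```

For a run on 1RG7 with the default output directory, the call would presumably be `top_rows("outputs/1RG7_mutations.csv")`.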

How It Works

  1. Fetch — Downloads PDB from RCSB or AlphaFold EBI (v4→v3→v2 fallback)
  2. Detect — Finds all protein residues within --cutoff Å of the ligand
  3. Interact — Classifies H-bonds, hydrophobic, ionic, π-stacking contacts
  4. Flexibility — Uses crystallographic B-factors as proxy for local mobility
  5. Score — Ranks residues by a weighted formula balancing contact type, flexibility, and mutation risk
  6. Advise — Outputs per-residue mutation strategies tailored to the goal
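
Steps 2 and 4 above can be illustrated with a few lines of standard-library Python. This is a sketch under stated assumptions, not PocketEngineer's actual implementation: it assumes standard fixed-column PDB records, ignores alternate locations and hydrogens, and takes a residue's flexibility proxy to be the mean B-factor over its atoms. The function names are hypothetical.

```python
import math
from collections import defaultdict


def parse_atoms(pdb_text):
    """Yield (record, resname, chain, resseq, xyz, bfactor) from fixed-width PDB lines."""
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        yield (record, line[17:20].strip(), line[21], int(line[22:26]),
               xyz, float(line[60:66]))


def pocket_residues(pdb_text, ligand, cutoff=4.5):
    """Map (chain, resseq, resname) to mean B-factor for protein residues
    with at least one atom within `cutoff` Å of any ligand atom."""
    atoms = list(parse_atoms(pdb_text))
    ligand_xyz = [a[4] for a in atoms if a[0] == "HETATM" and a[1] == ligand]
    bfactors = defaultdict(list)
    in_pocket = set()
    for record, resname, chain, resseq, xyz, b in atoms:
        if record != "ATOM":
            continue  # only protein atoms are pocket candidates
        key = (chain, resseq, resname)
        bfactors[key].append(b)
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_xyz):
            in_pocket.add(key)
    return {k: sum(bfactors[k]) / len(bfactors[k]) for k in sorted(in_pocket)}
```

The all-pairs distance check is quadratic in the number of atoms, which is acceptable for a single pocket; a spatial index such as a k-d tree would be the usual choice for larger structures.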

Priority Score Formula

activity:  score = 0.4·flex + 0.35·contact_weight + 0.25·(1 - risk)
flux:      score = 0.5·flex + 0.30·contact_weight + 0.20·(1 - risk)
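
The two formulas differ only in their weights, so they can be expressed as one function with a per-goal weight table. The sketch below assumes that `flex`, `contact_weight`, and `risk` are each normalised to the range 0 to 1, which this README does not state explicitly; `priority_score` is a hypothetical name.

```python
# (flex, contact_weight, 1 - risk) weights per goal, copied from the formulas above.
WEIGHTS = {
    "activity": (0.40, 0.35, 0.25),
    "flux": (0.50, 0.30, 0.20),
}


def priority_score(flex, contact_weight, risk, goal="activity"):
    """Weighted priority score; inputs are assumed to lie in [0, 1]."""
    w_flex, w_contact, w_risk = WEIGHTS[goal]
    return w_flex * flex + w_contact * contact_weight + w_risk * (1.0 - risk)
```

For a residue with flex = 0.8, contact_weight = 0.6, and risk = 0.2, the activity score is 0.32 + 0.21 + 0.20 = 0.73 and the flux score is 0.40 + 0.18 + 0.16 = 0.74. Both weight sets sum to 1, so the score stays in the same range as the inputs.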

Validation

PocketEngineer was benchmarked against experimentally validated directed evolution datasets:

| System | PDB | Known mutations | Recovered in top ranks |
| --- | --- | --- | --- |
| Kemp Eliminase HG2→HG3.17 | 3NYD | 9 | 9/9 |
| E. coli DHFR trimethoprim resistance | 1RX2 | 3 | 3/3 |
| Abl Kinase imatinib resistance | 3K5V | 3 clinical sites | 3/3 |

Example: Kemp Eliminase

bash pe.sh --pdb 3NYD --ligand 3NY --goal activity --chain A --no-conservation

All 9 experimentally selected mutations from 7 rounds of directed evolution (Hilvert lab, Nature 2013) appear in the top 13 ranked residues.
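
A "recovered in top ranks" figure of this kind can be computed from any ranked residue list with a helper like the one below. This is a sketch only: the function name is hypothetical, and the residue numbers in the usage note are made up for illustration and are not taken from the benchmark.

```python
def recovered_in_top_k(ranked_residues, known_sites, k):
    """Return the known sites that appear among the first k ranked residues."""
    top = set(ranked_residues[:k])
    return [site for site in known_sites if site in top]
```

With `ranked = [50, 12, 84, 7, 265]` and `known = [84, 265, 300]`, `recovered_in_top_k(ranked, known, 3)` returns `[84]` and `recovered_in_top_k(ranked, known, 5)` returns `[84, 265]`.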


Requirements

  • Python ≥ 3.9
  • Biopython
  • NumPy, pandas, matplotlib
  • requests
  • fpocket (optional, for apo structures)

Citation

If you use PocketEngineer in your research, please cite:

Anmol Adhav (2026). PocketEngineer: structure-guided directed evolution ranking. GitHub: https://github.com/anmol-adhav/pocketengineer


License

MIT License
