PocketEngineer is a command-line tool for structure-guided directed evolution. Given a PDB structure and a ligand, it identifies the binding pocket, classifies residue interactions, estimates flexibility, and ranks mutation candidates for either activity (tighter binding / better catalysis) or flux (higher substrate throughput / kcat).

    git clone https://github.com/anmol-adhav/pocketengineer.git
    cd pocketengineer
    conda create -n pocketengineer python=3.10 -y
    conda activate pocketengineer
    pip install -r requirements.txt
    # Optional: pocket detection without a ligand
    conda install -c conda-forge fpocket -y

    # From RCSB PDB ID + ligand code
    bash pe.sh --pdb 1RG7 --ligand MTX --goal activity
    # Specify chain explicitly (recommended for homodimers)
    bash pe.sh --pdb 1HTB --ligand NAD --goal activity --chain A
    # Optimise for throughput / kcat
    bash pe.sh --pdb 1ERQ --ligand BJH --goal flux --chain A
    # From UniProt ID (fetches AlphaFold model)
    bash pe.sh --uniprot P00533 --ligand -- --goal activity
    # From local PDB file with docked ligand
    bash pe.sh --local my_protein_docked.pdb --ligand LIG --goal activity --chain A
    # Skip evolutionary conservation scoring
    bash pe.sh --pdb 3NYD --ligand 3NY --goal activity --no-conservation

| Flag | Description | Default |
|---|---|---|
| `--pdb PDB_ID` | 4-character RCSB PDB code | (none) |
| `--uniprot ID` | UniProt accession (fetches the AlphaFold model) | (none) |
| `--local FILE` | Path to a local .pdb / .cif file | (none) |
| `--ligand CODE` | 3-character HETATM residue name | required |
| `--chain CHAIN` | Chain ID to analyse (e.g. A) | all chains, deduplicated |
| `--goal` | `activity` or `flux` | `activity` |
| `--cutoff Å` | Distance cutoff for pocket detection, in Å | 4.5 |
| `--no-conservation` | Skip MSA / conservation scoring | off |
| `--output DIR` | Output directory | `outputs/` |

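
When several structures need to be screened, the flags in the table above can be assembled programmatically. The following is a minimal sketch, assuming `pe.sh` sits in the current working directory and accepts the flags exactly as documented. `build_command` and `run_all` are hypothetical helpers written for this illustration, not part of PocketEngineer, and only the `--pdb` input mode is covered.

```python
import subprocess


def build_command(pdb_id, ligand, goal="activity", chain=None,
                  cutoff=None, no_conservation=False, output=None):
    """Assemble a pe.sh argument list from the documented flags (hypothetical helper)."""
    if goal not in ("activity", "flux"):
        raise ValueError("goal must be 'activity' or 'flux'")
    cmd = ["bash", "pe.sh", "--pdb", pdb_id, "--ligand", ligand, "--goal", goal]
    if chain is not None:
        cmd += ["--chain", chain]
    if cutoff is not None:
        cmd += ["--cutoff", str(cutoff)]
    if no_conservation:
        cmd.append("--no-conservation")
    if output is not None:
        cmd += ["--output", output]
    return cmd


# (PDB ID, ligand code, chain) triples taken from the usage examples.
targets = [("1RG7", "MTX", None), ("1HTB", "NAD", "A"), ("3NYD", "3NY", "A")]


def run_all(targets, dry_run=True):
    """Print each command (dry run) or execute it, stopping on the first failure."""
    commands = [build_command(pdb, lig, chain=chain) for pdb, lig, chain in targets]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_all(targets)  # dry run: only prints the commands
```

Passing the command as a list rather than a shell string avoids quoting problems with file paths, and `dry_run=True` makes it easy to inspect the commands before launching them.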
Each run writes the following files to the output directory (`outputs/` by default, changeable with `--output`):

| File | Description |
|---|---|
| `{PDB}_pocket_diagram.png` | 2D ligand interaction diagram |
| `{PDB}_mutations.csv` | Full ranked mutation table |
| `{PDB}_summary.txt` | Top 3 candidates with strategies |

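
For downstream analysis, the ranked table can be read back with the standard library. This is a minimal sketch, assuming that `{PDB}` is replaced by the ID passed to `--pdb`, that the CSV has a header row, and that rows are already in rank order. The column names are not documented here, so the sketch does not rely on any of them. `output_paths` and `load_mutations` are hypothetical helper names.

```python
import csv
from pathlib import Path


def output_paths(pdb_id, output_dir="outputs"):
    """Expected output files for one run, following the naming pattern in the table."""
    base = Path(output_dir)
    return {
        "diagram": base / f"{pdb_id}_pocket_diagram.png",
        "mutations": base / f"{pdb_id}_mutations.csv",
        "summary": base / f"{pdb_id}_summary.txt",
    }


def load_mutations(pdb_id, output_dir="outputs", limit=None):
    """Read the mutation table as a list of dicts keyed by the CSV's own headers."""
    path = output_paths(pdb_id, output_dir)["mutations"]
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    return rows if limit is None else rows[:limit]
```

For example, `load_mutations("1RG7", limit=10)` would return the first ten rows of `outputs/1RG7_mutations.csv`, provided a run with `--pdb 1RG7` has completed.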
- Fetch: Downloads the structure from RCSB, or an AlphaFold model from EBI (v4→v3→v2 fallback)
- Detect: Finds all protein residues within `--cutoff` Å of the ligand
- Interact: Classifies H-bond, hydrophobic, ionic, and π-stacking contacts
- Flexibility: Uses crystallographic B-factors as a proxy for local mobility
- Score: Ranks residues by a weighted formula balancing contact type, flexibility, and mutation risk
- Advise: Outputs per-residue mutation strategies tailored to the goal
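
The Detect and Flexibility steps can be illustrated with plain Python. This is a sketch of the general technique, not PocketEngineer's own code: it assumes atoms are supplied as `(residue_id, (x, y, z))` tuples, and it min-max scales B-factors to the range 0 to 1, which is an assumption made here for illustration. Both function names are hypothetical.

```python
import math


def pocket_residues(protein_atoms, ligand_coords, cutoff=4.5):
    """Residue IDs with at least one atom within `cutoff` Å of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples
    ligand_coords: iterable of (x, y, z) tuples
    """
    ligand_coords = list(ligand_coords)
    hits = set()
    for residue_id, coord in protein_atoms:
        if residue_id in hits:
            continue  # residue already counted via another atom
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_coords):
            hits.add(residue_id)
    return sorted(hits)


def normalised_flexibility(b_factors):
    """Min-max scale per-residue B-factors to [0, 1] (assumed normalisation)."""
    lo, hi = min(b_factors.values()), max(b_factors.values())
    if hi == lo:
        return {res: 0.0 for res in b_factors}
    return {res: (b - lo) / (hi - lo) for res, b in b_factors.items()}
```

The brute-force distance loop is adequate for a single ligand. For large complexes a spatial index such as a k-d tree would scale better.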

The scoring weights depend on the goal:

- activity: `score = 0.4·flex + 0.35·contact_weight + 0.25·(1 - risk)`
- flux: `score = 0.5·flex + 0.30·contact_weight + 0.20·(1 - risk)`

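
The two formulas can be written as a short function to see how the goal shifts the ranking. This is a sketch that assumes `flex`, `contact_weight`, and `risk` are each already scaled to the range 0 to 1. The residue names and input values below are made up for illustration and are not PocketEngineer output.

```python
# (flex, contact_weight, 1 - risk) weights per goal, copied from the formulas.
WEIGHTS = {
    "activity": (0.40, 0.35, 0.25),
    "flux": (0.50, 0.30, 0.20),
}


def residue_score(flex, contact_weight, risk, goal="activity"):
    """Weighted score for one residue; inputs are assumed to lie in [0, 1]."""
    w_flex, w_contact, w_risk = WEIGHTS[goal]
    return w_flex * flex + w_contact * contact_weight + w_risk * (1 - risk)


def rank(residues, goal="activity"):
    """Sort (name, flex, contact_weight, risk) tuples by descending score."""
    return sorted(
        residues,
        key=lambda r: residue_score(r[1], r[2], r[3], goal),
        reverse=True,
    )


# Made-up inputs: a mobile residue with weak contacts and a rigid one with strong contacts.
toy = [
    ("flexible_rim", 0.9, 0.3, 0.3),
    ("rigid_contact", 0.4, 0.9, 0.2),
]

if __name__ == "__main__":
    for goal in WEIGHTS:
        print(goal, [name for name, *_ in rank(toy, goal)])
```

With these toy values, `rigid_contact` ranks first for `activity` (0.675 against 0.64) and `flexible_rim` ranks first for `flux` (0.68 against 0.63), reflecting the larger flexibility weight in the flux formula.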
PocketEngineer was benchmarked against experimentally validated directed evolution datasets:
| System | PDB | Known mutations | Recovered in top ranks |
|---|---|---|---|
| Kemp Eliminase HG2→HG3.17 | 3NYD | 9 | 9/9 |
| E. coli DHFR trimethoprim resistance | 1RX2 | 3 | 3/3 |
| Abl Kinase imatinib resistance | 3K5V | 3 clinical sites | 3/3 |

To reproduce the Kemp eliminase result:

    bash pe.sh --pdb 3NYD --ligand 3NY --goal activity --chain A --no-conservation

All 9 experimentally selected mutations from 7 rounds of directed evolution (Hilvert lab, Nature 2013) appear in the top 13 ranked residues.

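
A recovery figure such as "9/9 in the top 13" can be computed from a ranked residue list and a set of known sites. The sketch below shows one way to do this, assuming residues are matched by exact identifier. The function names are hypothetical, and the labels in the usage note are placeholders rather than benchmark data.

```python
def recovered_in_top_n(ranked_residues, known_sites, n):
    """Return (number of known sites found in the top n, total known sites)."""
    top = set(ranked_residues[:n])
    found = [site for site in known_sites if site in top]
    return len(found), len(known_sites)


def smallest_n_recovering_all(ranked_residues, known_sites):
    """Smallest n for which every known site is in the top n, or None if any is unranked."""
    positions = []
    for site in known_sites:
        if site not in ranked_residues:
            return None
        positions.append(ranked_residues.index(site) + 1)  # 1-based rank
    return max(positions) if positions else 0
```

For a placeholder ranking `["R1", "R2", "R3", "R4", "R5"]` with known sites `["R2", "R4"]`, `recovered_in_top_n(..., n=3)` gives `(1, 2)` and `smallest_n_recovering_all(...)` gives `4`.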
- Python ≥ 3.9
- Biopython
- NumPy, pandas, matplotlib
- requests
- fpocket (optional, for apo structures)
If you use PocketEngineer in your research, please cite:

Anmol Adhav (2026). PocketEngineer: structure-guided directed evolution ranking. GitHub: https://github.com/anmol-adhav/pocketengineer

MIT License