TCoulth/AcidoMPNN

AcidoMPNN

Retraining of ProteinMPNN model specifically with acid-stable structures and sequences

Trained using the HyperMPNN training scripts, with most of the same sequence selection logic (i.e., the same clustering and quality cutoffs).

Structure Selection

Sequences were filtered by organism, selecting for acidophiles, and then for the presence of a secretion tag. Because organisms can maintain an intracellular pH different from that of their surroundings, secretion-tagged proteins from acidophiles were the most likely to be present in low-pH environments. After clustering at 50% sequence identity, AF2 structures were gathered and filtered by quality (pLDDT > 70).
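The quality filter can be sketched as follows. AlphaFold writes per-residue pLDDT into the B-factor column of its PDB files, so a structure-level score can be computed by averaging that column over C-alpha atoms. Whether the cutoff here was applied to the mean pLDDT or to some other summary is an assumption, and `mean_plddt` and `passes_quality` are illustrative names, not functions from this repository.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average pLDDT over C-alpha atoms of an AlphaFold-style PDB string."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed-width columns: atom name in 13-16, B-factor in 61-66
        # (0-based slices 12:16 and 60:66). AlphaFold stores pLDDT there.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha atoms found")
    return sum(scores) / len(scores)


def passes_quality(pdb_text: str, cutoff: float = 70.0) -> bool:
    """Assumed filter: keep a structure when its mean pLDDT exceeds the cutoff."""
    return mean_plddt(pdb_text) > cutoff
```

In practice this would be called on the text of each downloaded AF2 model, keeping only those for which `passes_quality` returns `True`.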

Training

A total of 21,129 sequence/structure pairs were used, with an 80-10-10 training-validation-test split.
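As a rough illustration of the split sizes, the sketch below shuffles a list of entries and cuts it 80-10-10. The real split was produced by the HyperMPNN training scripts and may have been made at the cluster level rather than per entry; the function name `split_80_10_10` and the fixed seed are assumptions for illustration only.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items reproducibly and return (train, valid, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(items) * 80 // 100
    n_valid = len(items) * 10 // 100
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


train, valid, test = split_80_10_10(range(21129))
print(len(train), len(valid), len(test))  # 16903 2112 2114
```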


Testing Results

Note: The model has not been experimentally tested yet. Please try it!

Generated sequences were folded with high confidence by AF2. Their amino acid compositions are distinct from the ProteinMPNN and HyperMPNN distributions. More analysis to follow.
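One simple way to compare amino acid compositions between two sets of designed sequences is to compute per-residue frequencies and take their difference. This is a minimal sketch of that idea, not the analysis used for the results above; `composition` and `composition_shift` are hypothetical helper names.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Fraction of each of the 20 standard amino acids across all sequences."""
    counts = Counter(
        aa for seq in sequences for aa in seq.upper() if aa in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def composition_shift(sequences_a, sequences_b):
    """Per-residue frequency difference (set A minus set B)."""
    freq_a = composition(sequences_a)
    freq_b = composition(sequences_b)
    return {aa: freq_a[aa] - freq_b[aa] for aa in AMINO_ACIDS}
```

For example, `composition_shift(acido_designs, proteinmpnn_designs)` (both lists of sequence strings you supply) gives a positive value for residues enriched in the first set.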

Use

Point ProteinMPNN to the AcidoMPNN .pt weights file.
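A hedged sketch of what this might look like: the stock ProteinMPNN `protein_mpnn_run.py` script takes a weights folder via `--path_to_model_weights` and a checkpoint name (without the `.pt` extension) via `--model_name`. The helper below only builds the command line; the file name `acidompnn.pt`, the paths, and the function name `build_mpnn_command` are placeholders, so check the flags against your ProteinMPNN checkout before running.

```python
from pathlib import Path


def build_mpnn_command(weights_file, pdb_path, out_folder,
                       num_seqs=8, temperature=0.1):
    """Build an argument list for ProteinMPNN using custom weights."""
    weights = Path(weights_file)
    return [
        "python", "protein_mpnn_run.py",
        # Folder containing the checkpoint, and its name without ".pt".
        "--path_to_model_weights", str(weights.parent),
        "--model_name", weights.stem,
        "--pdb_path", str(pdb_path),
        "--out_folder", str(out_folder),
        "--num_seq_per_target", str(num_seqs),
        "--sampling_temp", str(temperature),
    ]


# Placeholder paths; replace with your own files.
cmd = build_mpnn_command("weights/acidompnn.pt", "input.pdb", "designs")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from inside the ProteinMPNN directory.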
