Description
Hey Simplefold team,
Thanks for publishing this tool - it works great if installed according to the instructions in the readme.
However, the structure of accessing model architecture YAMLs via the configs/ directory in this repo doesn't allow for simplefold to be run from another directory or included as a dependency for another project.
Example 1 - simplefold can't be included as a package dependency
For example, if I install my own package "folding-tools" (via poetry), where I include simplefold as a dependency from this git repo, the config files aren't included in my installation of simplefold.
[project]
name = "folding-tools"
version = "0.0.0"
description = "folding"
authors = [
{name = "John Sterrett", email = "jsterrett@idtdna.com"}
]
requires-python = ">=3.10,<3.13"
dependencies = [
"simplefold @ git+ssh://git@github.com/apple/ml-simplefold.git",
"mlx==0.28.0",
"fair-esm @ git+ssh://git@github.com/facebookresearch/esm.git"
]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
This installs simplefold without any errors, but running the simplefold script raises a runtime error.
# assume local directory /path/to/dir/folding-tools
# install
conda create -n protein_folding_dev python=3.10
conda activate protein_folding_dev
pip install poetry
poetry install
# run simplefold script
simplefold --simplefold_model "simplefold_100M" --num_steps 100 --tau 0.01 --nsample_per_protein 1 --plddt --fasta_path test.fa --output_dir simplefold_output_test --backend mlx
Error:
Running protein folding with SimpleFold ...
Seed set to 42
Traceback (most recent call last):
File "/path/to/miniconda3/envs/protein_folding_dev/bin/simplefold", line 8, in <module>
sys.exit(main())
File "/path/to/miniconda3/envs/protein_folding_dev/lib/python3.10/site-packages/simplefold/cli.py", line 39, in main
predict_structures_from_fastas(args)
File "/path/to/miniconda3/envs/protein_folding_dev/lib/python3.10/site-packages/simplefold/inference.py", line 271, in predict_structures_from_fastas
model, device = initialize_folding_model(args)
File "/path/to/miniconda3/envs/protein_folding_dev/lib/python3.10/site-packages/simplefold/inference.py", line 72, in initialize_folding_model
model_config = omegaconf.OmegaConf.load(cfg_path)
File "/path/to/miniconda3/envs/protein_folding_dev/lib/python3.10/site-packages/omegaconf/omegaconf.py", line 189, in load
with io.open(os.path.abspath(file_), "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/dir/folding-tools/configs/model/architecture/foldingdit_100M.yaml'
Source of error
The following lines in simplefold cause this error by referencing files that weren't included in the package build. They look for the configs/ directory relative to the current working directory.
ml-simplefold/src/simplefold/inference.py, line 65 in 27ca3af:
cfg_path = os.path.join("configs/model/architecture", f"foldingdit_{simplefold_model[11:]}.yaml")
ml-simplefold/src/simplefold/inference.py, line 103 in 27ca3af:
plddt_module_path = "configs/model/architecture/plddt_module.yaml"
ml-simplefold/src/simplefold/inference.py, line 131 in 27ca3af:
plddt_latent_config_path = "configs/model/architecture/foldingdit_1.6B.yaml"
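To illustrate why these relative paths break, here is a minimal sketch (not simplefold code) showing that the same relative config path resolves to a different absolute location depending on the current working directory. The temporary directories are stand-ins for "wherever the user happens to run the command".

```python
import os
import tempfile

# Same relative path that inference.py builds for the 100M model.
relative_cfg = os.path.join("configs/model/architecture", "foldingdit_100M.yaml")

start_dir = os.getcwd()
resolved = []
for _ in range(2):
    # Each temporary directory stands in for a different place the CLI is run from.
    with tempfile.TemporaryDirectory() as workdir:
        os.chdir(workdir)
        try:
            # A relative path is resolved against the current working directory.
            resolved.append(os.path.abspath(relative_cfg))
        finally:
            os.chdir(start_dir)

for path in resolved:
    print(path)
```

The two printed paths differ, so the file is only found when the command is run from the root of the cloned repo.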
Example 2 - simplefold can't be run from another directory
If I install simplefold according to the readme, run cd .., and then try to run the simplefold script from the new directory, it also fails because of the relative path to the configs/ dir.
simplefold --simplefold_model "simplefold_100M" --num_steps 100 --tau 0.01 --nsample_per_protein 1 --plddt --fasta_path test.fa --output_dir simplefold_output_test --backend mlx --ckpt_dir ml-simplefold/artifacts
Running protein folding with SimpleFold ...
Traceback (most recent call last):
File "/path/to/miniforge3/envs/simplefold/bin/simplefold", line 7, in <module>
sys.exit(main())
File "/path/to/ml-simplefold/src/simplefold/cli.py", line 38, in main
predict_structures_from_fastas(args)
File "/path/to/ml-simplefold/src/simplefold/inference.py", line 267, in predict_structures_from_fastas
model, device = initialize_folding_model(args)
File "/path/to/ml-simplefold/src/simplefold/inference.py", line 78, in initialize_folding_model
with open(cfg_path, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'configs/model/architecture/foldingdit_100M.yaml'
Allowing simplefold script to be run from anywhere
In my fork, I've copied the relevant config files into src/simplefold/configs and used importlib.resources to access these from anywhere. Now, simplefold can be included as a dependency by other packages.
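A rough sketch of the importlib.resources approach, for illustration only (this is not the exact code in the fork). So that it runs without simplefold installed, it builds a throwaway package on disk; the package name `demo_fold_pkg`, the helper `architecture_config_text`, and the placeholder YAML content are all hypothetical.

```python
import importlib
import os
import sys
import tempfile
from importlib.resources import files
from pathlib import Path

# Build a throwaway package that ships its configs inside the package directory.
# "demo_fold_pkg" is a hypothetical stand-in for src/simplefold.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_fold_pkg"
cfg_dir = pkg_dir / "configs" / "model" / "architecture"
cfg_dir.mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
# Placeholder content, not the real architecture config.
(cfg_dir / "foldingdit_100M.yaml").write_text("placeholder: true\n")

sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()


def architecture_config_text(model_size: str, package: str = "demo_fold_pkg") -> str:
    """Read a config relative to the installed package, not the working directory."""
    resource = (
        files(package) / "configs" / "model" / "architecture"
        / f"foldingdit_{model_size}.yaml"
    )
    return resource.read_text(encoding="utf-8")


# Run from an unrelated directory to show the lookup no longer depends on cwd.
start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as elsewhere:
    os.chdir(elsewhere)
    try:
        config_text = architecture_config_text("100M")
    finally:
        os.chdir(start_dir)

print(config_text)
```

For this to work in a real install, the YAML files also have to be shipped as package data by the build backend, which is why they need to live under the package directory rather than at the repo root.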
[project]
name = "folding-tools"
version = "0.0.0"
description = "folding"
authors = [
{name = "John Sterrett", email = "jsterrett@idtdna.com"}
]
requires-python = ">=3.10,<3.13"
dependencies = [
"simplefold @ git+ssh://git@github.com/sterrettjd-idt/ml-simplefold.git",
"mlx==0.28.0",
"fair-esm @ git+ssh://git@github.com/facebookresearch/esm.git"
]
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
I can open a PR with these changes so you can see them. Would it be possible to incorporate these so that simplefold can be more widely used as a dependency for other tools?
Thanks!
John