Merged
33 commits
5f3cdff
Added manual function for running github actions and added pipeline e…
okkevaneck Aug 12, 2023
48de941
Forget variable sign
okkevaneck Aug 12, 2023
e57944d
Another try
okkevaneck Aug 12, 2023
eeab8e1
Another one
okkevaneck Aug 12, 2023
36fb5c3
Different approach
okkevaneck Aug 12, 2023
260b476
Changed usage
okkevaneck Aug 12, 2023
0b03a58
Removed brackets to don't reploy on push for non-master branches
okkevaneck Aug 12, 2023
666d893
Removed condition statements, as they might already be added by the i…
okkevaneck Aug 12, 2023
a06c2ef
Changed to string eval of true as github actions is possibly broken b…
okkevaneck Aug 12, 2023
f988140
Removed string eval as it apparently works
okkevaneck Aug 12, 2023
021e74d
Fix/locate version module from @rubenhorn (#58)
okkevaneck Aug 12, 2023
a44e3b8
Added specific typing for the pull_request workflow
okkevaneck Aug 12, 2023
8145e22
Merge branch 'develop' of github.com:OkkeVanEck/prospr into develop
okkevaneck Aug 12, 2023
386c327
Fixed typo
okkevaneck Aug 12, 2023
0d2e838
Added dependencies with develop as root branch (#59)
okkevaneck Aug 12, 2023
b37a36b
Upped version
okkevaneck Aug 12, 2023
d193104
Merged master into develop. Shouldn't be needed, but here we are.
okkevaneck Oct 22, 2025
212b9b7
Updated the archives
okkevaneck Oct 22, 2025
1bf2d5d
Added export_protein(protein, path) for writing PDB files (#71)
rubenhorn Oct 23, 2025
25babe5
Updated Mol* link syntax
okkevaneck Oct 26, 2025
0dc5b5b
Enable PDF generation in Read the Docs config (#74)
rubenhorn Oct 26, 2025
4671b21
Removed the push trigger for CI/CD to prevent unwanted executions.
okkevaneck Oct 26, 2025
b8ac39b
Bumped version of Numpy to v2.0.0 in order to overcome multiarray re-…
okkevaneck Oct 26, 2025
2677bb5
Changed github.ref to github.base_ref to allow for deployment on merg…
okkevaneck Oct 26, 2025
84b1447
Added an actions trigger for push on master branch
okkevaneck Oct 26, 2025
a1e50c1
Merge branch 'master' into develop
okkevaneck Oct 26, 2025
5bcb922
Merge branch 'master' of github.com:OkkeVanEck/prospr into develop
okkevaneck Nov 12, 2025
a8ce69c
Bumped version for deploy on master
okkevaneck Nov 12, 2025
4bf584f
Upgraded the pre-commit config to be functional
okkevaneck Nov 12, 2025
4df9c55
Bumped version for future merge
okkevaneck Nov 12, 2025
873811c
Ran pre-commit
okkevaneck Nov 12, 2025
7fbd9a2
Checkpoints for depth_first_bnb (#77) - Author: Ruben Horn
rubenhorn Dec 7, 2025
10ede7c
Merge branch 'master' into develop
okkevaneck Dec 7, 2025
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -4,7 +4,7 @@ exclude: >
(?x)^(
.idea/.*|
)$
-default_stages: [commit]
+default_stages: [pre-commit]
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v3.2.0
3 changes: 2 additions & 1 deletion docs/source/api.rst
@@ -207,7 +207,8 @@ can be imported from the *prospr.helpers* submodule, e.g.

| **export_protein**\ (*protein, path*)
| Save conformation of a protein in Protein Data Bank (PDB) file format
-for processing or visualization with external software such as `Mol* <https://molstar.org/>`_.
+for processing or visualization with external software such as
+`Mol* <https://molstar.org/>`_.
| *Parameters:*
| * **protein** - *Protein*: Protein object to save the hash of.
| * **path** - *os.PathLike or str*: The path of the output file.
20 changes: 20 additions & 0 deletions docs/source/quickstart.rst
@@ -365,6 +365,26 @@ be easily used via a direct import, as is shown below.
p_2d.hash_fold()
>> [1, 2, -1]

Checkpoints
-------------------
The algorithm *depth_first_bnb(protein)* supports checkpoints to resume an interrupted search
by storing the state of the protein and the algorithm to a file, after a signal
(*SIGTERM* or *SIGINT*) is received.

.. code-block:: python

import os

from prospr import Protein, depth_first_bnb

os.environ["PROSPR_CACHE_DIR"] = "/tmp/prospr"

p_2d = Protein("HPPH")
# Will read/write /tmp/prospr/depth_first_bnb/HPPH.checkpoint
depth_first_bnb(p_2d)
print("Done.")
>>>

Visualizing conformations
-------------------------
Visualizing conformations can be key to understanding how the resulting
2 changes: 1 addition & 1 deletion prospr/_version.py
@@ -1 +1 @@
-__version__ = "1.2.8"
+__version__ = "1.2.9"
213 changes: 210 additions & 3 deletions prospr/core/src/depth_first_bnb.cpp
@@ -7,14 +7,22 @@
*/

#include "depth_first_bnb.hpp"

-#include <math.h>
#include "utils.hpp"

#include <algorithm>
#include <atomic>
#include <csignal>
#include <fstream>
#include <iostream>
+#include <math.h>
#include <numeric>
#include <sstream>
#include <stack>
#include <vector>

/* Global flag for deferred handling of SIGINT and SIGTERM. */
std::atomic<int> COUGHT_SIGNAL{0};

/* All possible variables required by custom pruning. */
struct prune_vars {
size_t max_length;
@@ -90,10 +98,176 @@ bool reach_prune(Protein *protein, int move, int best_score,
return cur_score + branch_score >= best_score;
}

/* If a checkpoint location is provided through the environment, attempt to
* store the checkpoint.
*/
void try_store_checkpoint(const Protein &protein,
const std::stack<int> &dfs_stack, int move,
bool placed_amino, int best_score, int score,
const std::vector<int> &best_hash, int iterations) {
/* Return if cache not in use. */
auto cache_dir = get_cache_dir("depth_first_bnb", true);
if (!cache_dir) {
#ifdef PROSPR_DEBUG_STEPS
std::cout
<< "[Debug depth_first_bnb] No cache directory to save checkpoint to."
<< std::endl;
#endif
return;
}

/* If cache is in use, try writing to checkpoint. */
std::string filename =
*cache_dir + PATH_SEPARATOR + protein.get_sequence() + ".checkpoint";
#ifdef PROSPR_DEBUG_STEPS
std::cout << "[Debug depth_first_bnb] Writing to checkpoint: " << filename
<< std::endl;
#endif
std::ofstream ofs(filename);
if (!ofs)
throw std::runtime_error("Cannot open checkpoint file for writing.");

ofs << "; prospr checkpoint for sequence " << protein.get_sequence() << "\n";
ofs << "; Protein state:\n";
dump_protein_state(protein, ofs);
ofs << "\n; Algorithm state:\n";
ofs << "algorithm=depth_first_bnb\n";

/* Serialize stack as comma-separated list. */
std::stack<int> temp = dfs_stack;
std::vector<int> stack_data;
while (!temp.empty()) {
stack_data.push_back(temp.top());
temp.pop();
}
std::reverse(stack_data.begin(), stack_data.end());
ofs << "dfs_stack=";
for (size_t i = 0; i < stack_data.size(); ++i) {
if (i != 0)
ofs << ",";
ofs << stack_data[i];
}
ofs << "\n";

ofs << "move=" << move << "\n";
ofs << "placed_amino=" << placed_amino << "\n";
ofs << "best_score=" << best_score << "\n";
ofs << "score=" << score << "\n";

ofs << "best_hash=";
for (size_t i = 0; i < best_hash.size(); ++i) {
if (i != 0)
ofs << ",";
ofs << best_hash[i];
}
ofs << "\n";
ofs << "iterations=" << iterations << "\n";
}
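Note on the stack serialization above: `std::stack` exposes no iterators, so the function copies the stack, pops the copy into a vector, and reverses that vector so the bottom element is written first. Below is a minimal standalone sketch of that round trip. The helper names `serialize_stack`, `deserialize_stack`, and `check_round_trip` are hypothetical and not part of prospr.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stack>
#include <string>
#include <vector>

// Hypothetical helper: write a stack as "bottom,...,top".
// The stack is taken by value so the caller's stack is left untouched.
std::string serialize_stack(std::stack<int> s) {
    std::vector<int> items;
    while (!s.empty()) {
        items.push_back(s.top()); // collected top-first
        s.pop();
    }
    std::reverse(items.begin(), items.end()); // now bottom-first

    std::ostringstream out;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i != 0)
            out << ',';
        out << items[i];
    }
    return out.str();
}

// Hypothetical helper: rebuild a stack from "bottom,...,top".
// Pushing in file order restores the original top element.
std::stack<int> deserialize_stack(const std::string &text) {
    std::stack<int> s;
    std::stringstream ss(text);
    std::string token;
    while (std::getline(ss, token, ','))
        s.push(std::stoi(token));
    return s;
}

// Sanity check: a stack survives a serialize/deserialize round trip.
void check_round_trip(const std::stack<int> &s) {
    assert(deserialize_stack(serialize_stack(s)) == s);
}
```

Writing the elements bottom-first is what lets the reader rebuild the stack with plain `push` calls in file order, without a second reversal.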

/* If a checkpoint location is provided through the environment, attempt to load
* the checkpoint.
*/
void try_load_checkpoint(Protein &protein, std::stack<int> &dfs_stack,
int &move, bool &placed_amino, int &best_score,
int &score, std::vector<int> &best_hash,
int &iterations) {
/* Return if cache not in use. */
auto cache_dir = get_cache_dir("depth_first_bnb");
if (!cache_dir) {
#ifdef PROSPR_DEBUG_STEPS
std::cout
<< "[Debug depth_first_bnb] No cache directory to load checkpoint from."
<< std::endl;
#endif
return;
}

/* If cache is in use, try loading the checkpoint. */
std::string filename =
*cache_dir + PATH_SEPARATOR + protein.get_sequence() + ".checkpoint";
if (!file_exists(filename)) {
#ifdef PROSPR_DEBUG_STEPS
std::cout << "[Debug depth_first_bnb] No checkpoint to load:" << filename
<< std::endl;
#endif
return;
}

#ifdef PROSPR_DEBUG_STEPS
std::cout << "[Debug depth_first_bnb] Reading from checkpoint: " << filename
<< std::endl;
#endif
std::ifstream ifs(filename);
if (!ifs)
throw std::runtime_error("Cannot open checkpoint file for reading.");

/* Load the protein state. */
load_protein_state(protein, ifs);

/* Read the file again for loading the algorithm state. */
ifs.clear();
ifs.seekg(0, std::ios::beg);

std::string line;
while (std::getline(ifs, line)) {
std::string key;
std::string value;
if (!parse_ini_line(line, key, value))
continue;

if (key == "dfs_stack") {
/* Clear the stack. */
dfs_stack = std::stack<int>();
std::vector<int> stack_data;
std::stringstream ss(value);
std::string token;
while (std::getline(ss, token, ',')) {
stack_data.push_back(std::stoi(token));
}

/* Rebuild the stack. */
for (int v : stack_data)
dfs_stack.push(v);
} else if (key == "algorithm" && value != "depth_first_bnb") {
#ifdef PROSPR_DEBUG_STEPS
std::cerr << "[Debug depth_first_bnb] Unexpected value for checkpoint "
"algorithm: "
<< value << std::endl;
#endif
} else if (key == "move")
move = std::stoi(value);
else if (key == "placed_amino")
placed_amino = std::stoi(value);
else if (key == "best_score")
best_score = std::stoi(value);
else if (key == "score")
score = std::stoi(value);
else if (key == "best_hash") {
best_hash.clear();
std::stringstream ss(value);
std::string token;
while (std::getline(ss, token, ',')) {
best_hash.push_back(std::stoi(token));
}
} else if (key == "iterations")
iterations = std::stoi(value) - 1;
}
}
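The loader relies on `parse_ini_line` from `utils.hpp`, which is not shown in this diff. The sketch below is only an assumption about what such a helper might do. It is inferred from the call site (`parse_ini_line(line, key, value)` used as a boolean) and from the `;` comment lines and `key=value` lines that the writer emits. The name `parse_key_value` is hypothetical, and the real helper may trim whitespace or handle other cases differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a key=value line parser.
// Returns false for empty lines, ';' comment lines, and lines without
// a key; otherwise splits at the first '=' and returns true.
bool parse_key_value(const std::string &line, std::string &key,
                     std::string &value) {
    if (line.empty() || line[0] == ';')
        return false;

    const std::size_t eq = line.find('=');
    if (eq == std::string::npos || eq == 0)
        return false;

    key = line.substr(0, eq);
    value = line.substr(eq + 1); // may be empty, e.g. "best_hash="

    // Invariant: a successfully parsed line always has a non-empty key.
    assert(!key.empty());
    return true;
}
```

Under these assumptions an empty value such as `dfs_stack=` parses successfully, which matches what the writer produces when the stack is empty.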

/* Function to catch signals (SIGTERM, SIGINT) and store them for delayed
* handling. */
void signal_handler(int signal) {
COUGHT_SIGNAL.store(signal, std::memory_order_relaxed);
}

/* A depth-first branch-and-bound search function for finding a minimum
* energy conformation.
*/
void depth_first_bnb(Protein *protein, std::string prune_func) {
/* Override signal handlers. */
void (*signal_handler_sigint)(int) = std::signal(SIGINT, signal_handler);
void (*signal_handler_sigterm)(int) = std::signal(SIGTERM, signal_handler);

protein->reset_conformation();
size_t max_length = protein->get_sequence().length();
int dim = protein->get_dim();
@@ -158,7 +332,30 @@ void depth_first_bnb(Protein *protein, std::string prune_func) {
int score;
std::vector<int> best_hash;

int signal = 0;
int iterations = 0;

/* Load intermediate solution from cache if present. */
try_load_checkpoint(*protein, dfs_stack, move, placed_amino, best_score,
score, best_hash, iterations);
#ifdef PROSPR_DEBUG_STEPS
std::cout << "[Debug depth_first_bnb] Algorithm starting from iteration "
<< iterations << "." << std::endl;
#endif

do {
/* Break if a signal was caught. */
signal = COUGHT_SIGNAL.exchange(0);
if (signal)
break;

iterations++;
#ifdef PROSPR_DEBUG_STEPS
std::cout << "[Debug depth_first_bnb] Paused before iteration "
<< iterations << ". (Press enter to continue!) " << std::flush;
std::cin.get();
#endif

placed_amino = false;

/* Try to place the current amino acid. */
@@ -214,6 +411,16 @@ void depth_first_bnb(Protein *protein, std::string prune_func) {
}
} while (move != -dim - 1 || !dfs_stack.empty());

-/* Set best found conformation and return protein. */
+/* Write possible temporary solution to cache, if available. */
+try_store_checkpoint(*protein, dfs_stack, move, placed_amino, best_score,
+score, best_hash, iterations);
+
+/* Set best found conformation. */
protein->set_hash(best_hash);

/* Restore signal handlers and propagate caught signal. */
std::signal(SIGINT, signal_handler_sigint);
std::signal(SIGTERM, signal_handler_sigterm);
if (signal)
std::raise(signal);
}
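The signal handling in this file follows a deferred pattern: the handler only records the signal number in an atomic flag, the search loop polls that flag once per iteration, and the previous handlers are restored before the function returns. Below is a minimal standalone sketch of that pattern. The names `g_pending_signal`, `record_signal`, and `run_interruptible` are hypothetical, the `raise_at` parameter exists only to simulate an interrupt, and the sketch returns the caught signal instead of re-raising it as the diff does.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Flag written by the handler and polled by the worker loop.
std::atomic<int> g_pending_signal{0};

// The handler does nothing except record which signal arrived.
extern "C" void record_signal(int sig) {
    g_pending_signal.store(sig, std::memory_order_relaxed);
}

// Runs up to max_iterations steps. Returns the signal number that
// interrupted the loop, or 0 if it ran to completion. `completed`
// receives the number of finished steps. `raise_at` simulates an
// interrupt during that step (pass -1 for none).
int run_interruptible(int max_iterations, int raise_at, int &completed) {
    // Install the recording handler and remember the previous ones.
    auto prev_int = std::signal(SIGINT, record_signal);
    auto prev_term = std::signal(SIGTERM, record_signal);
    assert(prev_int != SIG_ERR && prev_term != SIG_ERR);

    int caught = 0;
    completed = 0;
    for (int i = 0; i < max_iterations; ++i) {
        // Poll and clear the flag at a safe point between steps.
        caught = g_pending_signal.exchange(0);
        if (caught)
            break;

        if (i == raise_at)
            std::raise(SIGINT); // stand-in for Ctrl-C in this sketch
        ++completed;
    }

    // A checkpoint would be written here, outside the handler.

    // Restore the previous handlers before returning.
    std::signal(SIGINT, prev_int);
    std::signal(SIGTERM, prev_term);
    return caught;
}
```

Keeping the handler down to a single atomic store means the file I/O for the checkpoint runs in normal program context, where stream operations are safe to call.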
22 changes: 17 additions & 5 deletions prospr/core/src/protein.cpp
@@ -180,7 +180,7 @@ Protein &Protein::operator=(const Protein &other) {
}

/* Returns the Protein's sequence. */
-std::string Protein::get_sequence() { return sequence; }
+std::string Protein::get_sequence() const { return sequence; }

/* Returns the Protein's set maximum dimension. */
int Protein::get_dim() { return dim; }
@@ -212,10 +212,22 @@ AminoAcid *Protein::get_amino(std::vector<int> position) {
int Protein::get_score() { return score; }

/* Returns the number of checked solutions. */
-std::uint64_t Protein::get_solutions_checked() { return solutions_checked; }
+std::uint64_t Protein::get_solutions_checked() const {
+return solutions_checked;
+}

/* Set the number of checked solutions. */
void Protein::_set_solutions_checked(std::uint64_t checked) {
solutions_checked = checked;
}

/* Returns the number of amino acids placed. */
-std::uint64_t Protein::get_aminos_placed() { return aminos_placed; }
+std::uint64_t Protein::get_aminos_placed() const { return aminos_placed; }

/* Set the number of amino acids placed. */
void Protein::_set_aminos_placed(std::uint64_t placed) {
aminos_placed = placed;
}

/* Returns if the amino acid at the given index is weighted. */
bool Protein::is_weighted(size_t index) {
@@ -340,7 +352,7 @@ void Protein::remove_amino() {
}

/* Hash and return the fold of the current conformation. */
-std::vector<int> Protein::hash_fold() {
+std::vector<int> Protein::hash_fold() const {
std::vector<int> fold_hash;
std::vector<int> cur_pos(dim, 0);
AminoAcid *cur_amino;
@@ -383,7 +395,7 @@ void Protein::_change_score(int move, bool placed) {
std::vector<int> moves;

for (int i = -dim; i <= dim; i++) {
-if (i != 0 and i != -move)
+if (i != 0 && i != -move)
moves.push_back(i);
}
