diff --git a/docs/figures/geometry_equi_md_sampling.svg b/docs/figures/geometry_equi_md_sampling.svg new file mode 100644 index 0000000..90f9243 --- /dev/null +++ b/docs/figures/geometry_equi_md_sampling.svg @@ -0,0 +1,4 @@ + + + +
Trajectories: List[StructureEnsemble]
equi_md_sampling
stru: Structure
param_method: MolDynParameterizer
\ No newline at end of file diff --git a/docs/source/index.rst b/docs/source/index.rst index de348e7..e08f8f0 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -52,9 +52,12 @@ This user guide contains: :caption: Tutorials sci_api_tutorial/how_to_assemble + sci_api_tutorial/preparation sci_api_tutorial/preparation_remove_solvent sci_api_tutorial/preparation_remove_hydrogens sci_api_tutorial/preparation_protonate_stru + sci_api_tutorial/geometry_mol_dyn_param + sci_api_tutorial/geometry_equi_md_sampling sci_api_tutorial/single_point sci_api_tutorial/add_missing_residues sci_api_tutorial/assign_mutant diff --git a/docs/source/sci_api_tutorial/assign_mutant.rst b/docs/source/sci_api_tutorial/assign_mutant.rst index 5fe9075..8918abc 100644 --- a/docs/source/sci_api_tutorial/assign_mutant.rst +++ b/docs/source/sci_api_tutorial/assign_mutant.rst @@ -1,5 +1,5 @@ ============================================== -Assign Mutant + Assign Mutant ============================================== Briefs diff --git a/docs/source/sci_api_tutorial/geometry_equi_md_sampling.rst b/docs/source/sci_api_tutorial/geometry_equi_md_sampling.rst new file mode 100644 index 0000000..c35f837 --- /dev/null +++ b/docs/source/sci_api_tutorial/geometry_equi_md_sampling.rst @@ -0,0 +1,252 @@ +======================================================= +Geometry: Equilibrium Molecular Dynamics Sampling +======================================================= + +Briefs +============================================== + +This science API, named ``enzy_htp.geometry.equi_md_sampling``, +performs a production run of Molecular Dynamics Simulation (hereinafter called **MD Simulation**) +with the system equilibrated by several short md simulations from the starting ``enzy_htp.structure.Structure`` class instance +(hereafter referred to as ``Structure`` instance). + +.. 
dropdown:: :fa:`eye,mr-1` Click to learn more about **Equilibrium Molecular Dynamics Sampling** + + (Essentially ``md_simulation()`` with preset steps) + Minimization (micro) -> Heating (NVT) -> Equilibrium (NPT) -> Production (NPT) + +Input/Output +============================================== + +.. panels:: + + :column: col-lg-12 col-md-12 col-sm-12 col-xs-12 p-2 text-left + + .. image:: ../../figures/geometry_equi_md_sampling.svg + :width: 100% + :alt: geometry_equi_md_sampling + +**input**: A well-prepared ``Structure`` instance (whether it is a protein, polypeptide, or ligand) and a ``MolDynParameterizer`` instance. + +.. admonition:: How to obtain a well-prepared ``Structure`` instance + + A ``Structure`` instance can be obtained by these `APIs `_. + + Note: Structure(s) with missing loops are not acceptable. + + To prepare a structure, please refer to these `APIs `_. + +.. admonition:: How to compose a ``MolDynParameterizer`` instance + + The ``MolDynParameterizer`` class is a parameterizer for Molecular Dynamics simulation. + + For detailed instructions, see `Molecular Dynamics Parameterizer `_. + +**output**: A list of ``StructureEnsemble`` instances, i.e. one trajectory per replica, each in ``StructureEnsemble`` format. + +Arguments +============================================== + +``stru`` + The input ``Structure`` instance (whether it is a protein, polypeptide, or ligand). + + (See `Input/Output <#input-output>`_ section) + +``param_method`` + The ``MolDynParameterizer`` instance for parameterization, constructed by ``Parameterizer()``, which determines the engine. + + (See `Input/Output <#input-output>`_ section) + +``parallel_runs`` + The number of desired parallel runs of the steps. + + (Integer, optional, default ``3``) + +``parallel_method`` + The method to parallelize the multiple runs. + + (String, optional, default ``cluster_job``) + +``work_dir`` + The directory in which all the MD input/intermediate/output files are saved. 
+ + (String, optional, default ``./MD``) + +``prod_time`` + The simulation time of the production step (unit: ns). + + (Float, optional, default ``50.0``) + +``prod_temperature`` + The production temperature (unit: K). + + (Float, optional, default ``300.0``) + +``prod_constrain`` + The constraints applied in the production step. + + (``List[structure_constraint.StructureConstraint]``, optional, default ``None``) + + .. dropdown:: :fa:`eye,mr-1` Click to learn more about ``StructureConstraint`` + + ``StructureConstraint`` is a class from the ``enzy_htp.structure.structure_constraint`` module that defines the API for a constraint. + + Each primitive ``StructureConstraint`` defines exactly one type of interaction. + + ``StructureConstraint`` objects are meant to define flexible, package-independent relationships that can be translated + between different software packages. + +``record_period`` + The simulation time interval between recorded geometries (unit: ns). + + (Float, optional, default ``0.5``) + +``cluster_job_config`` + The config for cluster_job if it is used as the parallel method. + + (Dictionary, optional, default ``None``) + + .. dropdown:: :fa:`eye,mr-1` Click to learn more about ``cluster_job_config`` + + The value of this argument depends on the settings of the supercomputer/cluster you use. + +``cpu_equi_step`` + Whether to use CPUs for the equilibrium step. + + (Boolean, optional, default ``False``) + + .. dropdown:: :fa:`eye,mr-1` Click to learn more about ``cpu_equi_step`` + + XXX + +``cpu_equi_job_config`` + The job config for the CPU equilibrium step if specified; takes effect when ``cpu_equi_step=True``. + + (Dictionary, optional, default ``None``) + + .. dropdown:: :fa:`eye,mr-1` Click to learn more about ``cpu_equi_job_config`` + + XXX + +``job_check_period`` + The check period for wait_to_2d_array_end; takes effect when ``parallel_method='cluster_job'``. 
(unit: s) + + (Integer, optional, default ``210``) + + +Examples +============================================== + +Prepare the Input: Load Structure +---------------------------------------------- + +To make use of the API, we first need to load a structure. + +.. code:: python + + import enzy_htp.structure as struct + + sp = struct.PDBParser() + + pdb_filepath = "/path/to/your/structure.pdb" + stru = sp.get_structure(pdb_filepath) + +Execute API +---------------------------------------------- + +Use ``geometry.equi_md_sampling`` to run an equilibrium MD simulation. + +.. code:: python + + import enzy_htp.structure as struct + + sp = struct.PDBParser() + + pdb_filepath = "/path/to/your/structure.pdb" + stru = sp.get_structure(pdb_filepath) + + from enzy_htp.core.clusters.accre import Accre + from enzy_htp.geometry import md_simulation, equi_md_sampling + from enzy_htp import interface + + amber_interface = interface.amber + + param_method = amber_interface.build_md_parameterizer() + cluster = Accre() # This is the interface for operating Vanderbilt University's Advanced Computational Clust + # You can add a custom class in the `enzy_htp/core/clusters` folder + # to make it compatible with the computational cluster resources at your own institution(s). + cluster_job_config = { + "cluster" : cluster, + "res_keywords" : { + "account" : "csb_gpu_acc", + "partition" : "batch_gpu", + "nodes": "1", + "node_cores" : "nvidia_rtx_a6000:2", + } + } + md_result = equi_md_sampling( + stru=stru, + param_method=param_method, + cluster_job_config=cluster_job_config, + job_check_period=10, + prod_time=0.5, + record_period=0.05) + +.. note:: + + Here, we execute the MD simulation with a very short ``prod_time`` for demonstration purposes. + + In real cases, ``prod_time`` will usually be between 30 ns and 110 ns. + +Check the Output +---------------------------------------------- + +Let's execute the API and inspect the result. + +.. 
panels:: + + :column: col-lg-12 col-md-12 col-sm-12 col-xs-12 p-2 text-left + + Here, we use a well-prepared complex containing SARS-CoV-2 Main Protease and Nirmatrelvir as an example. + + .. code:: python + + import enzy_htp.structure as struct + + sp = struct.PDBParser() + + pdb_filepath = "7si9_rm_water_aH.pdb" + stru = sp.get_structure(pdb_filepath) + + from enzy_htp.core.clusters.accre import Accre + from enzy_htp.geometry import md_simulation, equi_md_sampling + from enzy_htp import interface + + amber_interface = interface.amber + + param_method = amber_interface.build_md_parameterizer() + cluster = Accre() # This is the interface for operating Vanderbilt University's Advanced Computational Clust + # You can add a custom class in the `enzy_htp/core/clusters` folder + # to make it compatible with the computational cluster resources at your own institution(s). + cluster_job_config = { + "cluster" : cluster, + "res_keywords" : { + "account" : "csb_gpu_acc", + "partition" : "batch_gpu", + "nodes": "1", + "node_cores" : "nvidia_rtx_a6000:2", + } + } + md_result = equi_md_sampling( + stru=stru, + param_method=param_method, + cluster_job_config=cluster_job_config, + job_check_period=10, + prod_time=0.5, + record_period=0.05) + + len(md_result) # 3 (one StructureEnsemble per replica) + +The MD simulation has generated 3 replicas (the default value of ``parallel_runs``), which are stored in ``md_result`` as a list of ``StructureEnsemble`` instances. 
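As a rough sanity check on the settings used above, the number of geometries recorded per replica can be estimated from ``prod_time`` and ``record_period``. The sketch below is plain Python arithmetic and not part of EnzyHTP: the helper name ``expected_frames`` is hypothetical, and it assumes that one geometry is recorded every ``record_period`` ns of production time, which the MD engine may round differently.

```python
def expected_frames(prod_time_ns: float, record_period_ns: float) -> int:
    """Estimate recorded geometries per replica (hypothetical helper).

    Assumes one geometry is written every `record_period_ns` of production time.
    """
    if prod_time_ns <= 0 or record_period_ns <= 0:
        raise ValueError("both times must be positive")
    # round() guards against floating-point noise in the division.
    return int(round(prod_time_ns / record_period_ns))


# Settings used in the example above: 0.5 ns of production, recorded every 0.05 ns.
frames_per_replica = expected_frames(0.5, 0.05)
# Defaults listed in the Arguments section: 50.0 ns, recorded every 0.5 ns.
default_frames = expected_frames(50.0, 0.5)
print(frames_per_replica, default_frames)
```

Under this assumption, lengthening ``prod_time`` without changing ``record_period`` grows each trajectory proportionally, which is worth keeping in mind for disk usage across all replicas.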
+ +Author: Zhong, Yinjie diff --git a/docs/source/sci_api_tutorial/geometry_mol_dyn_param.rst b/docs/source/sci_api_tutorial/geometry_mol_dyn_param.rst new file mode 100644 index 0000000..890ad75 --- /dev/null +++ b/docs/source/sci_api_tutorial/geometry_mol_dyn_param.rst @@ -0,0 +1,141 @@ +=========================================================== +Geometry: Molecular Dynamics Parameter and Parameterizer +=========================================================== + +Briefs +============================================== + +This module, named ``enzy_htp._interface.handle_types.mol_dyn_parameterizer``, defines the interface of +``MolDynParameterizer`` and ``MolDynParameter`` as abstract classes for Molecular Dynamics Simulation (hereinafter called **MD Simulation**). + +In order to run an MD simulation, we need to apply the force fields to the complex, define the solvent box, etc., +which is the so-called "parameterization". However, different MD engines have different parameterization requirements. + +Thus, we use concrete subclasses of ``MolDynParameterizer`` and ``MolDynParameter`` for specific MD Simulation engines. + +.. dropdown:: :fa:`eye,mr-1` Click to learn more about **MD Simulation engines** + + Currently, the only available engine for MD simulation is Amber. + + Thus, the only concrete class of ``MolDynParameterizer`` at present is ``AmberParameterizer``. + +Concrete Implementations of ``MolDynParameterizer`` +======================================================= + +AmberParameterizer +---------------------------------------------- + +This class is the MD simulation parameterizer for Amber. 
+ +The recommended constructor of ``AmberParameterizer`` is ``enzy_htp._interface.amber_interface.AmberInterface.build_md_parameterizer``. + +Arguments for Constructor +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +``force_fields`` + The list of force fields used for parameterization, in Amber tleap format. + + The argument ``force_fields`` should be a List[str] value or 'default', which is the only allowed string value for ``force_fields``. + + (List[str], optional, default ``["leaprc.protein.ff14SB", "leaprc.gaff2", "leaprc.water.tip3p"]``) + +``charge_method`` + The method used to determine the atomic charges. + + This method is applied to the parameterization of ligands, modified amino acids, and metal binding sites. + + (String, optional, default ``AM1BCC``) + +``resp_engine`` (Works when ``charge_method="RESP"``.) + The engine for calculating the RESP charge. + + (String, optional, default ``g16``) + +``resp_lvl_of_theory`` (Works when ``charge_method="RESP"``.) + The level of theory for calculating the RESP charge. + + (String, optional, default ``b3lyp/def2svp em=d3``) + +``ncaa_param_lib_path`` + The path of the NCAA (Non-Canonical Amino Acid) parameter library. + + This is where all generated NCAA parameters are stored, which prevents redundant generation of parameters for the same NCAA. + + (String, optional, default ``../ncaa_lib``, relative to the working directory) + + .. dropdown:: :fa:`eye,mr-1` Click to learn more about ``ncaa_param_lib_path`` + + Normally we suggest setting this to a directory that contains all workflows of the same wild-type/template enzyme. + + * The NCAA-file correspondence is determined by + + 1. The 3-letter name in the file; + 2. The file name (if 1 does not exist); + + * Setting this to a path that is too general may cause conflicts when different NCAAs have the same name. + + (e.g. 
different tautomers or generic residue names like LIG) + +``force_renew_ncaa_parameter`` + Whether to force renewal of the parameter files (frcmod etc.) for all NCAAs (Ligands, Modified Amino Acids, or Metals). + + (Boolean, optional, default ``False``) + +``ncaa_net_charge_engine`` + The engine that determines the net charge of an NCAA if none is assigned in the NCAA objects (Ligands, Modified Residues, Metal Units). + + (String, optional, default ``PYBEL``) + +``ncaa_net_charge_ph`` + The pH value used in determining the net charge of an NCAA. + + (Float, optional, default ``7.0``) + +``solvate_box_type`` + The shape of the solvation box. + + (String, optional, default ``oct``) + +``solvate_box_size`` + The size of the solvation box (Unit: Angstrom). + + (Float, optional, default ``10.0``) + +``gb_radii`` + The igb number, which selects the effective GB radii used in the Generalized Born calculation. + This influences the GB radii in the prmtop file and is only used in implicit solvent calculations. + + (Integer, optional, default ``None``) + +``parameterizer_temp_dir`` + The temporary working directory that contains all the files generated by the AmberParameterizer. + + (String, optional, default ``{SCRATCH_DIR}/amber_parameterizer``) + + .. admonition:: About ``SCRATCH_DIR`` + + ``SCRATCH_DIR`` is a directory for scratch use, which you can define yourself. + +``additional_tleap_lines`` + Handle for adding additional tleap lines before generating the parameters. + + (List[str], optional, default ``None``) + + +Examples +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The simplest use of the constructor is as follows. + +.. code:: python + + from enzy_htp import interface + amber_interface = interface.amber + + param_method = amber_interface.build_md_parameterizer() + + type(param_method) # + +We can learn from the output that an ``AmberParameterizer`` instance has been constructed by the ``build_md_parameterizer`` function. 
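Returning to the ``force_fields`` argument described earlier, the stated rule (either a list of strings, or the single string ``'default'``) can be sketched as a small check. This is illustrative plain Python and not EnzyHTP's own validation code: the helper ``normalize_force_fields`` is hypothetical, and the default list is copied from the parenthesized default in the argument description, on the assumption that it matches the library.

```python
from typing import List, Union

# Default list as given in the argument description (assumed to match the library).
DEFAULT_FORCE_FIELDS = ["leaprc.protein.ff14SB", "leaprc.gaff2", "leaprc.water.tip3p"]


def normalize_force_fields(force_fields: Union[str, List[str]] = "default") -> List[str]:
    """Hypothetical helper: resolve `force_fields` to a list of tleap leaprc names."""
    if isinstance(force_fields, str):
        # 'default' is the only string value the documentation allows.
        if force_fields != "default":
            raise ValueError("the only allowed string value is 'default'")
        return list(DEFAULT_FORCE_FIELDS)
    if not all(isinstance(ff, str) for ff in force_fields):
        raise TypeError("force_fields must be a list of strings")
    return list(force_fields)


print(normalize_force_fields())
print(normalize_force_fields(["leaprc.protein.ff19SB", "leaprc.water.opc"]))
```

A list that passes this kind of check would then be handed to the constructor as ``force_fields=...``; whether the library performs the same checks internally is not confirmed here.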
+ +Author: Zhong, Yinjie \ No newline at end of file diff --git a/docs/source/sci_api_tutorial/preparation.rst b/docs/source/sci_api_tutorial/preparation.rst new file mode 100644 index 0000000..b4bcc99 --- /dev/null +++ b/docs/source/sci_api_tutorial/preparation.rst @@ -0,0 +1,14 @@ +============================================== +APIs that Help You Prepare Structures +============================================== + +Before performing a simulation, the structures should be well prepared. +For example, we need to remove solvents and add hydrogen atoms to the structure complex. + +The following APIs can help you prepare your structures. + +- `Remove solvents `_ +- `Remove hydrogen atoms `_ +- `Protonate structure `_ + +Author: Zhong, Yinjie diff --git a/docs/source/sci_api_tutorial/single_point.rst b/docs/source/sci_api_tutorial/single_point.rst index 937bd5a..36c75ea 100644 --- a/docs/source/sci_api_tutorial/single_point.rst +++ b/docs/source/sci_api_tutorial/single_point.rst @@ -108,16 +108,16 @@ Arguments .. dropdown:: :fa:`eye,mr-1` Click to see full argument explanations ``stru`` - the target molecule of the calculation represented as Structure() + The target molecule of the calculation represented as Structure() It can also be an ensemble of structures as StructureEnsemble() and in this case, each geometry in this ensemble will be calculated. (See `Input/Output <#input-output>`_ section) ``engine`` - the QM or QM/MM engine as a keyword. (See `Input/Output <#input-output>`_ section) + The QM or QM/MM engine as a keyword. (See `Input/Output <#input-output>`_ section) ``method`` - the level of theory of this calculation as a LevelOfTheory(). + The level of theory of this calculation as a LevelOfTheory(). This is used when there is only 1 region specified. (See `Input/Output <#input-output>`_ section) ``regions`` @@ -136,20 +136,20 @@ Arguments (See `Input/Output <#input-output>`_ section) ``capping_method`` - | the free valence capping method. 
(See `Capping Methods `_) - | default: ``"res_ter_cap"`` + The free valence capping method. (See `Capping Methods `_) + default: ``"res_ter_cap"`` ``embedding_method`` - | The embedding method of multiscale simulation. + The embedding method of multiscale simulation. This is used when more than 1 region is specified. Supported keywords: ["mechanical"] - | default: ``"mechanical"`` + default: ``"mechanical"`` ``parallel_method`` - | the method to parallelize the multiple runs when more + The method to parallelize the multiple runs when more than 1 geometry is in the input StructureEnsemble The execution will serial and locally if None is given. - | default: ``"cluster_job"`` + default: ``"cluster_job"`` ``cluster_job_config`` the config for cluster_job if it is used as the parallel method.