Add more type hints: Literal and TypedDict #240

@conjuncts

The typing module in Python 3.8 and later makes function parameters much easier to read at a glance. I want to highlight two features, Literal and TypedDict, that would make it much easier for users to discover which parameter values are valid.

Literal

Literal lets the user understand enums, especially sentinel values, at a quick glance.

A good example is AmberInterface.build_md_parameterizer. This is the current method signature:

def build_md_parameterizer(
    self,
    force_fields: List[str] = "default",
    charge_method: str = "default",
    # ...
) -> AmberParameterizer:

Many of these parameters are enums, and "default" is a frequent sentinel value. However, the signature does not show which other values are acceptable, so the user has to read the code to find out.

This is precisely the use case for Literal and Union, instead of hinting with a plain str:

def build_md_parameterizer(
    self,
    force_fields: Union[
        List[Literal["leaprc.protein.ff14SB", "leaprc.gaff", "leaprc.water.tip3p"]],
        Literal["default"],
    ] = "default",
    charge_method: Literal["AM1BCC", "bcc", "RESP", "resp", "rc", "default"] = "default",
    # ...
) -> AmberParameterizer:

Another example enhancement is build_md_step:

def build_md_step(self,
    # ...
    thermostat: Literal["constant_energy", "berendsen", "anderson", "langevin", "oin", "sin-respa", "bussi", "default"] = "default",
    # ...
) -> AmberMDStep:

This information is already documented in the code, but finding it requires manual inspection, i.e., reading SUPPORTED_CHARGE_METHOD_MAPPER or the docstring. With Literal, static analysis tools (type checkers, IDE autocompletion) can pick up the allowed values automatically.
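As a rough illustration of that last point, here is a minimal sketch in which the Literal alias is the single source of truth for both the type checker and a runtime check. The alias name ChargeMethod and the helper check_charge_method are hypothetical and not part of the codebase; the values are copied from the charge_method example above.

```python
from typing import Literal, get_args

# Hypothetical alias; the values mirror the charge_method example above.
ChargeMethod = Literal["AM1BCC", "bcc", "RESP", "resp", "rc", "default"]


def check_charge_method(charge_method: ChargeMethod = "default") -> str:
    """Return the value unchanged if it is one of the declared literals."""
    # get_args() recovers the literal strings from the alias at runtime,
    # so the allowed values do not have to be listed a second time.
    allowed = get_args(ChargeMethod)
    if charge_method not in allowed:
        raise ValueError(
            f"charge_method must be one of {allowed}, got {charge_method!r}"
        )
    return charge_method
```

A type checker would flag a call such as check_charge_method("am1") before the code runs, and the same list of values backs the runtime error message.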

TypedDict

TypedDict provides type hints for dictionaries, and helps understand which parameters are and aren't allowed. The main benefit for TypedDict would be the cluster_job_config object. cluster_job_config is currently type-hinted as a simple dict, but using a TypedDict would let the user understand which parameters can be changed without the need for deep code inspection.

As an example, the following TypedDict would provide hints for the user on which parameters are permitted in cluster_job_config:

from typing import Literal, TypedDict
from typing import NotRequired  # Python 3.11+; use typing_extensions on older versions

class ClusterJobResKeywords(TypedDict):
    core_type: Literal["gpu", "cpu"]
    nodes: str
    node_cores: str
    job_name: str
    partition: str
    mem_per_core: str
    walltime: str
    account: str
    qos: NotRequired[str]  # optional key

class ClusterJobConfigDict(TypedDict):
    """The old dict type for cluster_job_config"""
    cluster: ClusterInterface
    res_keywords: ClusterJobResKeywords

Then,

def equi_md_sampling(stru: Structure,
    # ...
    cpu_equi_job_config: Optional[ClusterJobConfigDict] = None,
    # ...
) -> List[StructureEnsemble]:
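To show what the user gains, here is a self-contained sketch using a trimmed-down version of the dicts above. Everything in it is illustrative: the ClusterInterface class is a placeholder stand-in, the names ResKeywords, JobConfig and describe_job are hypothetical, and only a few of the keys are included. It also uses the total=False inheritance pattern for the optional key, which works on Python 3.8+ without NotRequired.

```python
from typing import Literal, Optional, TypedDict


class ClusterInterface:
    """Placeholder stand-in for the real cluster interface class."""


class _RequiredResKeywords(TypedDict):
    # Keys that must always be present (subset of the full proposal).
    core_type: Literal["gpu", "cpu"]
    nodes: str
    walltime: str


class ResKeywords(_RequiredResKeywords, total=False):
    # Optional key; equivalent to NotRequired[str] on Python 3.11+.
    qos: str


class JobConfig(TypedDict):
    cluster: ClusterInterface
    res_keywords: ResKeywords


def describe_job(job_config: Optional[JobConfig] = None) -> str:
    """Summarize a job config; None means no cluster job was requested."""
    if job_config is None:
        return "no cluster job configured"
    res = job_config["res_keywords"]
    return f"{res['core_type']} job, {res['nodes']} node(s), walltime {res['walltime']}"


config: JobConfig = {
    "cluster": ClusterInterface(),
    "res_keywords": {"core_type": "gpu", "nodes": "1", "walltime": "1-00:00:00"},
}
```

With these hints, a type checker would report a misspelled key such as "wall_time" or an unsupported core_type such as "tpu" in the dict literal, and an IDE can autocomplete the key names. At runtime the value is still an ordinary dict, so existing callers would not need to change.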

AmberMDStep.md_config_dict() and other configuration dicts could benefit in the same way.

Discussion

These two type hints would make valid parameter values visible at a glance. That in turn makes it easier for users to deviate from the template scripts.

Because this suggestion mostly involves pattern-matching against existing docstrings and mapper dicts, AI tooling could be effective in implementing it.
