Add parallelization #24

@noTban

Several types of parallelization could be added:

  • Parallelization of operations within a step (operation parallelism). We could try to parallelize some steps; see #16.
  • Parallelization of run_batch (batch parallelism). For now, the pipeline takes one folder and runs the specified steps on its data. The CLI can also run on a batch of folders, but this is currently purely sequential. Since steps can be executed one at a time, it should be possible to run the corresponding step of each pipeline in parallel and take advantage of the GPU when needed, for example for the deep models, which would then run on batches of inputs.
  • Parallelization of independent steps (DAG parallelism). This is not urgent, since the pipeline is mostly linear for now. See "Add automatic step parallelization" #21.
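To make the batch parallelism idea more concrete, here is a minimal sketch of running one step across all folders before moving on to the next step. Everything in it is an assumption made for illustration: `run_batch_stepwise`, the `Step` signature, the state dictionaries, and the toy `load` and `segment` steps are hypothetical and do not reflect the actual pipeline API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical step signature: takes one folder's state, returns the updated state.
Step = Callable[[dict], dict]


def run_batch_stepwise(
    folders: List[str], steps: List[Step], max_workers: int = 4
) -> Dict[str, dict]:
    """Apply each step to every folder in parallel, one step at a time."""
    # One independent state per folder (hypothetical structure).
    states = {folder: {"folder": folder, "history": []} for folder in folders}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for step in steps:
            # pool.map yields results in input order, so they line up with `folders`.
            results = pool.map(step, [states[f] for f in folders])
            states = dict(zip(folders, results))
    return states


# Toy steps standing in for real pipeline steps (hypothetical).
def load(state: dict) -> dict:
    return {**state, "history": state["history"] + ["load"]}


def segment(state: dict) -> dict:
    return {**state, "history": state["history"] + ["segment"]}


if __name__ == "__main__":
    out = run_batch_stepwise(["folder_a", "folder_b", "folder_c"], [load, segment])
    for folder, state in out.items():
        print(folder, state["history"])
```

Threads are used here only because they are the simplest option to show. CPU-bound steps written in pure Python would probably need `ProcessPoolExecutor` instead, and a GPU step would more likely receive the whole list of states at once so that the model can run on a single batch of inputs.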
