
Feature: Add export_ppl_to_yaml / export_ppl_to_json utility #19

Description

@Rudra-clrscr

Summary

Currently, all pipeline (PPL) configurations are stored internally in an SQLite database. This works well for querying and tracking, but researchers cannot inspect, version-control, or share a specific pipeline configuration without writing custom query code. This issue proposes two utilities, export_ppl_to_yaml and export_ppl_to_json, that write a PPL configuration to a human-readable file.

Proposed API

from plf.experiment import export_ppl_to_yaml, export_ppl_to_json
 
# Export a single PPL config to YAML
export_ppl_to_yaml("ppl_data_run_001", output_path="./configs/ppl_data_run_001.yaml")
 
# Export to JSON
export_ppl_to_json("ppl_data_run_001", output_path="./configs/ppl_data_run_001.json")
 
# Optionally, export all active PPLs at once (one file per PPL, written to output_dir)
export_ppl_to_yaml("*", output_dir="./configs/")
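The calls above mix two destination arguments: output_path for a single PPL and output_dir for the "*" wildcard. One possible way to resolve them is sketched below. The helper resolve_export_paths, its parameters, and the "<pplid><suffix>" file naming are all hypothetical and not part of plf; known_pplids stands in for whatever list of active PPLs the database returns.

```python
from pathlib import Path


def resolve_export_paths(pplid, known_pplids, output_path=None,
                         output_dir=None, suffix=".yaml"):
    """Map each pplid to export onto its destination file (hypothetical helper)."""
    if pplid == "*":
        # Wildcard: one file per known PPL, named after its pplid (assumed naming).
        if output_dir is None:
            raise ValueError('exporting "*" requires output_dir')
        return {p: Path(output_dir) / f"{p}{suffix}" for p in known_pplids}
    if pplid not in known_pplids:
        raise KeyError(f"unknown pplid: {pplid!r}")
    if output_path is None:
        raise ValueError("exporting a single pplid requires output_path")
    return {pplid: Path(output_path)}


# Example: the wildcard expands to one target file per PPL.
paths = resolve_export_paths(
    "*", ["ppl_data_run_001", "ppl_data_run_002"], output_dir="./configs/"
)
for ppl, target in paths.items():
    print(ppl, "->", target)
```

Keeping path resolution separate from serialization would let the YAML and JSON exporters share the same wildcard handling and differ only in the suffix.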

Expected output format (YAML example)

pplid: ppl_data_run_001
status: frozen
created_at: "2025-11-02T14:30:00"
workflow:
  loc: my_workflows.GenericDataWorkflow
  args: {}
args:
  data_source:
    loc: my_workflows.MyComputationalComponent
    args:
      initial_value: 42
  algorithm:
    loc: my_workflows.MyComputationalComponent
    args:
      param_b: 5
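For comparison, the JSON export of the same record would presumably carry identical fields. The sketch below rebuilds the YAML example as a plain Python mapping and serializes it with the standard json module; the indent setting is an assumption, since the issue does not specify JSON formatting.

```python
import json

# The record from the YAML example above, as a plain Python mapping.
record = {
    "pplid": "ppl_data_run_001",
    "status": "frozen",
    "created_at": "2025-11-02T14:30:00",
    "workflow": {"loc": "my_workflows.GenericDataWorkflow", "args": {}},
    "args": {
        "data_source": {
            "loc": "my_workflows.MyComputationalComponent",
            "args": {"initial_value": 42},
        },
        "algorithm": {
            "loc": "my_workflows.MyComputationalComponent",
            "args": {"param_b": 5},
        },
    },
}

# indent=2 keeps the nesting readable; dict insertion order is preserved,
# so the fields appear in the same order as in the YAML example.
json_text = json.dumps(record, indent=2)
print(json_text)
```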

Acceptance criteria

  • export_ppl_to_yaml(pplid, output_path) and export_ppl_to_json(pplid, output_path) functions added to plf.experiment
  • Output includes pplid, status, created_at, workflow, and the full nested args config, as in the example above
  • Raises a clear error if the pplid does not exist
  • Unit tests added for both export formats
  • Functions documented in the README under Experiment Management Tools
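A minimal sketch of how the JSON exporter and its missing-pplid check might look, using only the standard library. Everything about the storage layer here is an assumption: the table name ppls, its columns, the config column holding JSON, and the explicit conn parameter (the real function would presumably obtain the connection from plf internals). A YAML variant would differ only in the serialization step.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def export_ppl_to_json(pplid, output_path, conn):
    """Write one PPL record to output_path as JSON; raise if pplid is unknown."""
    # Assumed schema: ppls(pplid, status, created_at, config).
    row = conn.execute(
        "SELECT pplid, status, created_at, config FROM ppls WHERE pplid = ?",
        (pplid,),
    ).fetchone()
    if row is None:
        raise LookupError(f"No PPL with pplid {pplid!r} found in the database")
    record = {"pplid": row[0], "status": row[1], "created_at": row[2]}
    # Assumed: the config column stores the workflow/args mapping as JSON text.
    record.update(json.loads(row[3]))
    out = Path(output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return out


# Demo against an in-memory database with the assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ppls (pplid TEXT PRIMARY KEY, status TEXT, "
    "created_at TEXT, config TEXT)"
)
config = {
    "workflow": {"loc": "my_workflows.GenericDataWorkflow", "args": {}},
    "args": {
        "algorithm": {
            "loc": "my_workflows.MyComputationalComponent",
            "args": {"param_b": 5},
        }
    },
}
conn.execute(
    "INSERT INTO ppls VALUES (?, ?, ?, ?)",
    ("ppl_data_run_001", "frozen", "2025-11-02T14:30:00", json.dumps(config)),
)

with tempfile.TemporaryDirectory() as tmp:
    path = export_ppl_to_json(
        "ppl_data_run_001", Path(tmp) / "configs" / "ppl_data_run_001.json", conn
    )
    exported = json.loads(path.read_text(encoding="utf-8"))

print(exported["pplid"], exported["status"])
```

The demo doubles as the shape a unit test could take: export against a throwaway database, reload the file, and compare fields; a second test would assert that an unknown pplid raises the error.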
