
Feature: Hierarchical / child tasks #8

@deepjoy

Summary

Support parent-child task relationships where a parent task can spawn child sub-tasks, and the parent's lifecycle is tied to the completion of its children.

Motivation

A common pattern in file transfer systems is the multipart upload: a single file transfer (parent) is broken into N parts (children) that execute concurrently. Today TaskMill's task model is flat: every task is independent. This forces users to manage sub-task concurrency and lifecycle outside of TaskMill, losing the benefits of its scheduling, persistence, and backpressure.

Proposed Behavior

  • A TaskSubmission can include child tasks, or a running TaskExecutor can dynamically spawn children via TaskContext
  • Children inherit the parent's priority by default but can override it
  • The parent task auto-transitions to a completed/failed state based on children:
    • All children succeed → parent succeeds
    • Any child fails (after retries) → parent fails, remaining children are cancelled
  • Children can be individually cancelled without affecting siblings
  • TaskStore persists the parent-child relationship so that after a restart, resumed children are correctly associated with their parent
  • SchedulerEvent emits child-level progress that can be aggregated to parent-level progress (e.g. "part 7/20 complete" rolls up to "file 35% transferred")
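
The rollup rules above could be sketched roughly as follows. This is only an illustration: `ChildState`, `ParentOutcome`, `roll_up`, and `parent_progress_percent` are hypothetical names, not existing TaskMill types, and the progress helper assumes all children are equally sized parts. The proposal does not say how an individually cancelled child should count toward the parent, so the sketch leaves the parent in progress in that case.

```rust
/// Hypothetical child states; TaskMill's real state type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ChildState {
    Pending,
    Running,
    Succeeded,
    /// Failed after exhausting retries.
    Failed,
    /// Cancelled individually; siblings are unaffected.
    Cancelled,
}

/// Hypothetical outcome the scheduler would apply to the parent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ParentOutcome {
    InProgress,
    Succeeded,
    /// A child failed; the remaining children should be cancelled.
    Failed,
}

/// Derives the parent's outcome from its children's states.
pub fn roll_up(children: &[ChildState]) -> ParentOutcome {
    // Assumption: a parent with no children yet is still in progress.
    if children.is_empty() {
        return ParentOutcome::InProgress;
    }
    // Any child that failed after retries fails the parent.
    if children.iter().any(|c| *c == ChildState::Failed) {
        return ParentOutcome::Failed;
    }
    // Only when every child has succeeded does the parent succeed.
    if children.iter().all(|c| *c == ChildState::Succeeded) {
        return ParentOutcome::Succeeded;
    }
    // Pending, running, or cancelled children keep the parent open
    // (how cancelled children count is an open question).
    ParentOutcome::InProgress
}

/// Parent progress as a whole percentage, assuming equal-sized children.
pub fn parent_progress_percent(children: &[ChildState]) -> u8 {
    if children.is_empty() {
        return 0;
    }
    let done = children
        .iter()
        .filter(|c| **c == ChildState::Succeeded)
        .count();
    (done * 100 / children.len()) as u8
}
```

With 7 of 20 children in `Succeeded`, `parent_progress_percent` returns 35, matching the "part 7/20 complete" example. A real implementation would probably weight each child by its expected bytes rather than counting parts.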

Example Use Case

// Parent: transfer a 10 GiB file (10 * 1024^3 = 10_737_418_240 bytes)
let parent = TaskSubmission::new("transfer", "upload:s3://bucket/large-file.tar.gz")
    .with_priority(Priority::NORMAL)
    .with_expected_io_bytes(10_737_418_240);

// Children: 10 x 1 GiB parts, spawned dynamically by the parent executor
// during execution based on the multipart upload plan
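
To make the "multipart upload plan" concrete, here is a minimal sketch of how the parent executor might split the transfer into byte ranges before spawning one child per range. `PartRange` and `plan_parts` are hypothetical helpers, not part of TaskMill, and how each range becomes a child task (for example through `TaskContext`) is left open by this proposal.

```rust
/// Byte range of one part: [offset, offset + len). Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PartRange {
    pub index: u32,
    pub offset: u64,
    pub len: u64,
}

/// Splits `total_bytes` into consecutive parts of at most `part_size` bytes.
/// The last part is shorter when the total is not an exact multiple.
pub fn plan_parts(total_bytes: u64, part_size: u64) -> Vec<PartRange> {
    assert!(part_size > 0, "part_size must be non-zero");
    let mut parts = Vec::new();
    let mut offset = 0u64;
    let mut index = 0u32;
    while offset < total_bytes {
        let len = part_size.min(total_bytes - offset);
        parts.push(PartRange { index, offset, len });
        offset += len;
        index += 1;
    }
    parts
}
```

For the file above, `plan_parts(10_737_418_240, 1_073_741_824)` yields 10 ranges of 1 GiB each, one per child task.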

Design Considerations

  • Should child tasks share the parent's concurrency slot, or consume their own? (Likely their own, with a per-parent child concurrency limit)
  • Cancelling a parent should cancel all pending/running children and invoke their abort hooks
  • TaskStore schema needs a parent_id column on the tasks table
  • History/audit trail should link children to their parent for debugging
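
As a rough illustration of the considerations above, the sketch below models the proposed `parent_id` link, a cascading parent cancel, and a per-parent child concurrency check using an in-memory map. Everything here (`MemoryStore`, `TaskRow`, `State`, and the method names) is hypothetical and stands in for whatever `TaskStore` actually does. It handles only direct children and does not invoke abort hooks; it just returns the ids for which they would run.

```rust
use std::collections::HashMap;

pub type TaskId = u64;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Pending,
    Running,
    Succeeded,
    Failed,
    Cancelled,
}

#[derive(Debug, Clone)]
pub struct TaskRow {
    pub id: TaskId,
    /// Mirrors the proposed `parent_id` column; `None` for top-level tasks.
    pub parent_id: Option<TaskId>,
    pub state: State,
}

/// In-memory stand-in for the persistent store (assumption for this sketch).
#[derive(Default)]
pub struct MemoryStore {
    rows: HashMap<TaskId, TaskRow>,
}

impl MemoryStore {
    pub fn insert(&mut self, id: TaskId, parent_id: Option<TaskId>, state: State) {
        self.rows.insert(id, TaskRow { id, parent_id, state });
    }

    pub fn state(&self, id: TaskId) -> Option<State> {
        self.rows.get(&id).map(|r| r.state)
    }

    /// Ids of the direct children of `parent`, sorted for stable output.
    pub fn children_of(&self, parent: TaskId) -> Vec<TaskId> {
        let mut ids: Vec<TaskId> = self
            .rows
            .values()
            .filter(|r| r.parent_id == Some(parent))
            .map(|r| r.id)
            .collect();
        ids.sort_unstable();
        ids
    }

    /// Cancels `parent` and its pending/running children. Returns the ids of
    /// the children that were cancelled, i.e. where abort hooks would run.
    pub fn cancel_parent(&mut self, parent: TaskId) -> Vec<TaskId> {
        let mut cancelled = Vec::new();
        for id in self.children_of(parent) {
            if let Some(row) = self.rows.get_mut(&id) {
                if matches!(row.state, State::Pending | State::Running) {
                    row.state = State::Cancelled;
                    cancelled.push(id);
                }
            }
        }
        if let Some(row) = self.rows.get_mut(&parent) {
            row.state = State::Cancelled;
        }
        cancelled
    }

    /// Whether another child of `parent` may start under a per-parent limit.
    pub fn can_start_child(&self, parent: TaskId, limit: usize) -> bool {
        let running = self
            .rows
            .values()
            .filter(|r| r.parent_id == Some(parent) && r.state == State::Running)
            .count();
        running < limit
    }
}
```

Children that have already finished keep their terminal state when the parent is cancelled, which keeps the history trail accurate. In a SQL-backed store, `children_of` would presumably become an indexed lookup on `parent_id`.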
