Data race in State.make_runner with concurrent child execution #1089

@xbrianh

Description

Summary

State.make_runner mutates the shared gremlin instance via gremlin.state = prepared_state right before entry.run(gremlin). In ParallelStage, multiple child runners execute concurrently (asyncio.gather), so this shared mutation can race and cause a stage to observe another child's state. This can lead to incorrect artifact_dir/worktree/cwd substitutions and wrong artifact bindings.

Reproduction

In ParallelStage with multiple concurrent children, each child runner is created via make_runner(entry, gremlin=gremlin), which returns an async function that captures the same gremlin object. When the runners execute concurrently under asyncio.gather(), they race on gremlin.state = prepared_state. Child A can set its state, but before Child A's stage reads gremlin.state (e.g., in Stage.substitute_vars()), Child B can overwrite it, so Child A's stage sees Child B's state.
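The interleaving described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `Gremlin`, `Entry`, and this `make_runner` are hypothetical stand-ins that only mirror the shape described in this issue (a shared object, an assignment to `state`, then an awaited `run`). The `asyncio.sleep(0)` is an assumption standing in for whatever await point the real stage hits before it reads `gremlin.state`.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gremlin:
    # Hypothetical stand-in for the shared gremlin instance.
    state: Optional[str] = None


class Entry:
    # Hypothetical stand-in for a child stage entry.
    def __init__(self, name: str) -> None:
        self.name = name

    async def run(self, gremlin: Gremlin) -> Optional[str]:
        # Yield to the event loop once, as real stage work would at an await.
        await asyncio.sleep(0)
        # Read the state at call time, like Stage.substitute_vars() is said to do.
        return gremlin.state


def make_runner(entry: Entry, gremlin: Gremlin):
    async def runner() -> Optional[str]:
        # Shared mutation: every child writes to the same object.
        gremlin.state = f"state-for-{entry.name}"
        return await entry.run(gremlin)

    return runner


async def main() -> list:
    gremlin = Gremlin()
    runners = [make_runner(Entry(name), gremlin) for name in ("A", "B")]
    return await asyncio.gather(*(runner() for runner in runners))


observed = asyncio.run(main())
print(observed)  # child A reports B's state: ['state-for-B', 'state-for-B']
```

In this sketch, child A yields before reading the state, child B overwrites it, and both children then observe B's value.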

Reviewer comment

State.make_runner mutates the shared gremlin instance via gremlin.state = prepared_state right before entry.run(gremlin). In ParallelStage, multiple child runners execute concurrently (asyncio.gather), so this shared mutation can race and cause a stage to observe another child's state (especially via Stage.substitute_vars(), which reads gremlin.state at call time). This can lead to incorrect artifact_dir/worktree/cwd substitutions and wrong artifact bindings.
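One possible direction, sketched under assumptions, is to give each child its own gremlin view so the assignment no longer touches a shared object. `Gremlin`, `Entry`, and `make_isolated_runner` below are hypothetical names for illustration, not the project's API, and whether a shallow `copy.copy` is safe for the real gremlin object (which may hold resources that must stay shared, or must not be) is not something this issue establishes.

```python
import asyncio
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gremlin:
    # Hypothetical stand-in for the shared gremlin instance.
    state: Optional[str] = None


class Entry:
    # Hypothetical stand-in for a child stage entry.
    def __init__(self, name: str) -> None:
        self.name = name

    async def run(self, gremlin: Gremlin) -> Optional[str]:
        await asyncio.sleep(0)  # yield so the children interleave
        return gremlin.state    # read at call time


def make_isolated_runner(entry: Entry, gremlin: Gremlin):
    async def runner() -> Optional[str]:
        # Assumption: a shallow copy is enough to isolate `state` per child.
        child_gremlin = copy.copy(gremlin)
        child_gremlin.state = f"state-for-{entry.name}"
        return await entry.run(child_gremlin)

    return runner


async def main() -> list:
    shared = Gremlin()
    runners = [make_isolated_runner(Entry(name), shared) for name in ("A", "B")]
    results = await asyncio.gather(*(runner() for runner in runners))
    # The shared instance is never mutated by the children.
    assert shared.state is None
    return results


observed_isolated = asyncio.run(main())
print(observed_isolated)  # ['state-for-A', 'state-for-B']
```

An alternative with the same effect would be to pass the prepared state to the stage explicitly rather than through the gremlin object; which option fits depends on how stages currently consume gremlin.state.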

Ref #1088
#1088 (comment)
