Skip to content

Route stage I/O through Gremlin, drop direct State imports#1088

Closed
xbrianh wants to merge 15 commits into
mainfrom
issue-1075-route-through-gremlin-iter7
Closed

Route stage I/O through Gremlin, drop direct State imports#1088
xbrianh wants to merge 15 commits into
mainfrom
issue-1075-route-through-gremlin-iter7

Conversation

@xbrianh
Copy link
Copy Markdown
Owner

@xbrianh xbrianh commented Jun 3, 2026

Summary

Makes State internal to Gremlin by updating all stage signatures to accept gremlin instead of state, and eliminating direct state.py imports from external callers.

  • Stages: Drop State import; pass gremlin explicitly via stage.run(gremlin) -> Outcome
  • Other callers: Route through Gremlin interface instead of calling build_state, validate_gremlin_id, etc. directly
  • State ownership: State becomes gremlin.state, fully owned and managed by Gremlin

Closes #1075

xbrianh and others added 10 commits June 3, 2026 10:51
- Change Stage.run(state) to Stage.run(gremlin)
- Stages access gremlin.state for runtime context
- Update StageProtocol and GremlinProtocol
- Pass gremlin through make_runner to all stage instances
- Update run_child and spawn/child to wrap state in GremlinWrapper

Stages now receive gremlin instead of state, making state fully
managed by Gremlin as the framework's execution context manager.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Move validate_gremlin_id, resolve_state_file, resolve_artifact_dir,
  landable_shape functions to new state_utils.py module
- Update all imports to use state_utils instead of state for these functions
- Add GremlinWrapper classes in run_child, spawn/child, and land for
  compatibility with new stage signature
- State becomes more clearly internal, utility functions available separately

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Remove dead fallback in make_runner: gremlin is now a required parameter instead of optional, eliminating the unreachable path that passed State where GremlinProtocol is expected
- Remove empty if TYPE_CHECKING blocks from agent.py and loop.py that served no purpose
- Extract shared _GremlinWrapper class to executor/state.py, replacing duplicates in run_child.py and spawn/child.py
- Add fork stub to _LandGremlinWrapper in fleet/land.py for protocol conformance
- Remove unnecessary module docstring from state_utils.py
- Update test stages and tests to use new GremlinProtocol signature and pass gremlin wrapper instances

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Add state and registry properties to State class to implement GremlinProtocol,
allowing State objects to be passed directly to stage.run() in tests.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@xbrianh xbrianh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structural direction is right — stages now talk to gremlin rather than state directly, and the utility split into state_utils.py is clean. Four issues worth addressing: a duplicate wrapper class in land.py that also has a wrong fork signature, a missing type annotation in make_runner, a silent no-op setter on State that could silently discard state updates, and three test files that define their own _GremlinWrapper instead of using the shared one.

Comment thread gremlins/fleet/land.py Outdated
self.state = st
self.registry = getattr(st, "artifacts", None)

async def fork(self) -> None:
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_LandGremlinWrapper duplicates GremlinWrapper from state.py, which already does the same thing and has the correct fork signature. The local fork(self) -> None takes no positional args, so if it were ever called it'd raise TypeError (wrong number of args) rather than NotImplementedError.

Replace with:

from gremlins.executor.state import GremlinWrapper
...
wrapper = GremlinWrapper(state)

Also: getattr(st, "artifacts", None) in __init__ is defensive coding at an internal boundary — state is always a State here; just use st.artifacts.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: removed _LandGremlinWrapper and now using GremlinWrapper directly. Changed defensive getattr to direct st.artifacts access.

Comment thread gremlins/executor/state.py Outdated
scope: Sequence[StageProtocol] | None = None,
*,
record_stage: bool = True,
gremlin: Any,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gremlin: Any should be gremlin: GremlinProtocol. The whole point of the protocol is type safety at this boundary; Any defeats it. The import is already present via TYPE_CHECKING.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: changed gremlin parameter to GremlinProtocol for type safety.

Comment thread gremlins/executor/state.py Outdated
def state(self) -> State:
return self

@state.setter
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The no-op setter silently swallows gremlin.state = prepared_state when gremlin is a State. That means any caller that accidentally passes a raw State as gremlin gets the original state, not the prepared one — a silent correctness bug with no error. If State is not intended to be used as a gremlin argument, remove this shim and let the type error surface explicitly.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: removed the no-op state property setter from State class.

from gremlins.stages.parallel import ParallelStage
from gremlins.stages.sequence import SequenceStage


Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private _GremlinWrapper here and in test_orchestrator_boss.py and test_stage_loop.py all duplicate GremlinWrapper from gremlins.executor.state. test_parallel_runner.py already imports it directly. Replace all three with from gremlins.executor.state import GremlinWrapper.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed: replaced _GremlinWrapper classes in test_active_children.py, test_stage_loop.py, and test_orchestrator_boss.py with imports of GremlinWrapper from gremlins.executor.state.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the execution boundary so stages are invoked with a Gremlin-shaped object (GremlinProtocol) rather than receiving State directly, and it moves state-related utilities (e.g., ID validation and path resolution) out of executor.state to help make State an internal implementation detail.

Changes:

  • Update stage run(...) signatures and internal runner plumbing to pass a gremlin object (with gremlin.state) instead of a state parameter.
  • Move validate_gremlin_id, resolve_state_file, resolve_artifact_dir, and landable_shape into a new gremlins.executor.state_utils module and update call sites.
  • Update test and subprocess/CLI entrypoints to construct/pass Gremlin-like wrappers where needed.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/test_stage_loop.py Wraps runtime State in a minimal Gremlin-like object for LoopStage tests.
tests/test_skip_if_exists.py Updates a test stage to run(gremlin) and passes a wrapper into make_runner(...).
tests/test_parallel_runner.py Updates local test stages to run(gremlin) and uses GremlinWrapper when calling make_runner(...) / running stages.
tests/test_orchestrator_boss.py Adjusts loop/orchestrator tests to pass a Gremlin-like wrapper to loop.run(...).
tests/test_active_children.py Switches sequence/loop tests to pass a GremlinProtocol wrapper and updates test stage signatures.
gremlins/stages/sequence.py Changes SequenceStage to run(gremlin) and threads gremlin through child runner creation.
gremlins/stages/parallel.py Changes ParallelStage to run(gremlin) and threads gremlin into child runner creation.
gremlins/stages/loop.py Changes LoopStage to run(gremlin) and threads gremlin into runner building.
gremlins/stages/exec.py Changes Exec stage to run(gremlin) and uses gremlin for substitutions.
gremlins/stages/composite.py Renames child_state parameter for clarity; still builds derived child states.
gremlins/stages/base.py Updates Stage base run(...) signature and routes substitutions through gremlin.state.
gremlins/stages/agent.py Changes Agent stage to run(gremlin) and routes substitutions through gremlin.
gremlins/spawn/pipeline.py Switches gremlin-id validation import to executor.state_utils.
gremlins/spawn/child.py Uses GremlinWrapper to call stage.run(gremlin); switches validation import to state_utils.
gremlins/run_child.py Uses GremlinWrapper to call stage.run(gremlin); switches validation import to state_utils.
gremlins/protocols.py Expands GremlinProtocol to include state; updates StageProtocol to run(gremlin).
gremlins/launcher.py Switches gremlin-id validation import to executor.state_utils.
gremlins/fleet/land.py Switches land helpers to state_utils and wraps state to call land_stage.run(gremlin).
gremlins/executor/state.py Removes several free functions (moved to state_utils), adds GremlinProtocol-like helpers, and updates make_runner to accept a gremlin object.
gremlins/executor/state_utils.py New module housing gremlin-id validation and state/artifact path helpers (plus landable_shape).
gremlins/executor/run.py Switches artifact/state path imports to executor.state_utils.
gremlins/executor/parallel_state.py Switches state file resolution import to executor.state_utils.
gremlins/executor/gremlin.py Adds Gremlin.state field and passes self into State.make_runner(..., gremlin=self).
gremlins/cli/resume.py Switches gremlin-id validation import to executor.state_utils.
Comments suppressed due to low confidence (1)

gremlins/stages/loop.py:23

  • This stage module still imports State from gremlins.executor.state at runtime. The PR/issue goal is to make gremlins.executor.state an internal implementation detail and to eliminate direct imports from stage implementations. In this file, State is only needed for typing (UntilFn, helper signatures). Consider switching these to a protocol/Any, or moving the State import under TYPE_CHECKING and updating runtime type aliases to not require State at import time.
from gremlins.artifacts.registry import ArtifactRegistry
from gremlins.executor.state import State
from gremlins.protocols import GremlinProtocol
from gremlins.stages.base import Stage, get_client_from_dict
from gremlins.stages.composite import child_state as _child_state
from gremlins.stages.outcome import Bail, Done, Outcome
from gremlins.utils import git as _git

logger = logging.getLogger(__name__)

# Called after a clean (no marker) iteration; returns True to exit the loop.
UntilFn = Callable[[State, int, str], bool]


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 551 to +558
async def _run_async() -> Any:
if entry.skip_if_exists and base_state.artifacts.verified(
entry.skip_if_exists
):
return Done()
return await entry.run(_prepare())
prepared_state = _prepare()
gremlin.state = prepared_state
return await entry.run(gremlin)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed as #1089 — this is a data race in State.make_runner that requires careful design to fix. Multiple concurrent children racing on gremlin.state mutation is a real issue that deserves its own PR.

Comment thread gremlins/fleet/land.py
Comment on lines 873 to +895
from gremlins.clients.client import PACKAGE_DEFAULT
from gremlins.executor.state import StateData, build_state
from gremlins.stages.outcome import Bail

state = build_state(
data=StateData(),
client=PACKAGE_DEFAULT,
artifact_dir=artifact_dir,
cwd=cwd,
artifacts=registry,
)

class _LandGremlinWrapper:
def __init__(self, st: Any) -> None:
self.state = st
self.registry = getattr(st, "artifacts", None)

async def fork(self) -> None:
raise NotImplementedError("fork not supported in land context")

wrapper = _LandGremlinWrapper(state)
try:
asyncio.run(land_stage.run(state))
asyncio.run(land_stage.run(wrapper))
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in same commit as 3350552438 — now using GremlinWrapper directly instead of _LandGremlinWrapper.

Comment thread gremlins/stages/exec.py
Comment on lines 14 to 18
from gremlins.artifacts.schemes import snapshot_head_before
from gremlins.artifacts.uri import Uri
from gremlins.executor.state import State
from gremlins.protocols import GremlinProtocol
from gremlins.stages.base import Stage
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is part of the larger refactoring to eliminate State imports from stages. Moving FRAMEWORK_KEYS to a non-state module requires coordinated updates across exec.py, agent.py, parallel.py, and composite.py. Recommend filing a follow-up issue to track this as a single refactoring task.

Comment thread gremlins/stages/agent.py
Comment on lines 7 to 13
from gremlins.artifacts.resolve import resolve_in_map
from gremlins.artifacts.uri import Uri
from gremlins.executor.state import State
from gremlins.protocols import GremlinProtocol
from gremlins.stages.agent_runner import run_agent
from gremlins.stages.base import Stage, get_client_from_dict
from gremlins.stages.outcome import Bail, Done, Outcome
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as #3350576190 — part of larger refactoring to eliminate State imports from stage modules.

Comment on lines 20 to 27
from gremlins import paths
from gremlins.artifacts.uri import Uri
from gremlins.executor.parallel_state import ParallelGroupState
from gremlins.executor.state import State, StateData
from gremlins.protocols import GremlinProtocol
from gremlins.stages.base import Stage
from gremlins.stages.composite import child_state as _child_state
from gremlins.stages.outcome import Bail, Done, Outcome
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires careful refactoring to move State/StateData usage behind TYPE_CHECKING or replace with protocols. This is part of the larger stage-import refactoring goal.

Comment on lines 7 to 11
from gremlins import paths as _paths
from gremlins.clients.client import PACKAGE_DEFAULT
from gremlins.executor.state import State
from gremlins.stages.base import Stage

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The child_state() helper in composite.py is a key blocker for eliminating State imports from stages. This likely needs to be moved to the executor layer or refactored to use protocols. Recommend tracking as a separate refactoring issue.

xbrianh and others added 5 commits June 3, 2026 11:29
… wrappers

- Change gremlin: Any to gremlin: GremlinProtocol in State.make_runner for type safety
- Remove no-op state property setter from State class that masks correctness bugs
- Replace _LandGremlinWrapper with import of GremlinWrapper in land.py
- Replace local _GremlinWrapper classes in test files with import from gremlins.executor.state
- Remove unused State import from test_active_children.py

Fixes address review comments from #1088.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Update all stage classes to handle both State and GremlinProtocol arguments. Previously, code tried to access gremlin.state unconditionally, but tests pass State objects directly. Now check if gremlin is a State instance first.

Update test mocks to provide required artifacts attribute when build_state is mocked.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@xbrianh xbrianh closed this Jun 3, 2026
@xbrianh xbrianh deleted the issue-1075-route-through-gremlin-iter7 branch June 3, 2026 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remove direct state.py imports — route through Gremlin instances

2 participants