Add input sources mechanism for pipeline artifact seeding#1082
Conversation
- Define InputSource and InputSources classes to declare external artifact sources - Parse sources: block from inputs: section in pipeline YAML - Update Pipeline to store input_sources - Register instructions artifact in artifact directory - Add _seed_registry_from_sources function to drive pre-seeding from sources - Update launcher to use sources for registry pre-seeding with fallback to legacy hardcoding - Update gh-terse.yaml to declare issue, plan, and instructions sources
- Test legacy fallback when no input_sources present - Test string sources for issue, plan, and instructions - Test optional vs required source handling - Test union types (filepath + string) - Test error cases for missing required sources
- Resolve values based on source key (plan, issue, instructions) - Avoid mixing plan argument with issue/plan source semantics - Issue only uses plan arg if it's not a file (i.e., issue reference) - Improves testability and semantic correctness
Removed unused pathlib import and backward-compat legacy path. Trimmed oversized docstrings to one-line max. Moved empty-types validation to top of __post_init__. Deleted dead methods all_sources() and required_sources(). Eliminated what-narration comments throughout _seed_registry_from_sources. Changed exception handling to only catch file-not-found errors, letting validation errors propagate. Added validation in InputSources.from_yaml to restrict allowed source keys at parse time. Threaded loaded_pipeline through _Inputs to eliminate duplicate pipeline loads. Addresses code review and complexity review findings. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
xbrianh
left a comment
There was a problem hiding this comment.
Overall the design is clean — decoupling source declarations from launcher logic is the right move. Four issues worth addressing: a weak type annotation on _Inputs, a re-export in pipeline/__init__.py that violates the project style, and two redundant file writes introduced alongside the new seeding logic.
There was a problem hiding this comment.
Pull request overview
Adds a first-class “input sources” declaration under inputs: in pipeline YAML so the launcher can pre-seed the artifact registry from externally-provided inputs (e.g., plan/issue/instructions) in a structured, validated way, instead of relying on ad-hoc hardcoded seeding.
Changes:
- Introduces
InputSource/InputSourcesparsing and attaches parsedinputs.sourcesontoPipeline. - Updates the launcher to seed the artifact registry based on declared
input_sources, with legacy fallback behavior. - Updates
gh-terse.yamlto declare sources and adds focused test coverage for parsing + launcher seeding.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
gremlins/pipeline/inputs.py |
New schema/validation and YAML parsing for inputs.sources. |
gremlins/pipeline/__init__.py |
Parses inputs.sources during pipeline load and stores it on Pipeline. |
gremlins/launcher.py |
Adds registry seeding logic driven by declared input sources and adjusts state dir preparation. |
gremlins/pipelines/gh-terse.yaml |
Declares issue/plan/instructions sources in pipeline YAML. |
tests/test_input_sources.py |
Unit tests for parsing/validation and pipeline integration. |
tests/test_launcher_input_sources.py |
Tests launcher registry seeding behavior (including legacy fallback and union types). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Fix type annotation on loaded_pipeline to _PipelineData | None (not Any) - Move InputSources import under TYPE_CHECKING to avoid re-export - Remove redundant instructions.txt write in _prepare_state_dir - Remove redundant plan-arg.txt write in legacy fallback path - Always seed plan_arg when --plan is provided, even with input_sources - Make issue source optional in gh-terse.yaml for file-backed plans - Validate optional field is actually boolean in InputSources.from_yaml - Allow arbitrary source keys in InputSources.from_yaml for extensibility
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
gremlins/run_child.py:124
_build_state()no longer passes aninstructionsvalue intobuild_state(), but the file’s top-level docstring still lists an"instructions"field in the spec JSON schema. That’s now inaccurate and can confuse callers producing spec files.
Update the documented spec schema to match the actual fields consumed.
return build_state(
data=data,
client=client,
artifact_dir=artifact_dir,
pipeline_data=pipeline_data,
repo=str(spec.get("repo") or ""),
child_key=spec.get("child_key") or None,
parent_stage=str(spec.get("parent_stage") or ""),
worktree=worktree,
worktree_parent=worktree_parent,
)
| - name: resolve-plan-input | ||
| type: exec | ||
| in: | ||
| plan_arg: "plan_arg?" | ||
| plan: "plan?" | ||
| options: |
| - name: resolve-plan-input | ||
| type: exec | ||
| in: | ||
| plan_arg: "plan_arg?" | ||
| plan: "plan?" | ||
| options: |
| def _seed_registry_from_sources( | ||
| registry: ArtifactRegistry, | ||
| input_values: dict[str, str], | ||
| sources: dict[str, Any], | ||
| artifacts_dir: pathlib.Path, | ||
| ) -> None: | ||
| for key, source in sources.items(): | ||
| value = input_values.get(key) or None | ||
| if not value: | ||
| if not source.optional: | ||
| raise ValueError( | ||
| f"required input source {key!r} (type: {source.types}) is not available" | ||
| ) | ||
| continue | ||
| for t in source.types: | ||
| if t == "filepath" and os.path.isfile(value): | ||
| registry.bind(key, Uri.parse(f"file://{value}")) | ||
| break | ||
| elif t == "string": | ||
| dest = artifacts_dir / f"{key}.txt" | ||
| dest.write_text(value, encoding="utf-8") | ||
| registry.bind(key, Uri.parse(f"file://session/{key}.txt")) | ||
| break |
| in: | ||
| INSTRUCTIONS: instructions? | ||
| PLAN: plan? | ||
| PR: pr? | ||
| sources: | ||
| plan: | ||
| type: [filepath, string] | ||
| optional: true | ||
| instructions: | ||
| type: string | ||
| optional: true |
| in: | ||
| INSTRUCTIONS: instructions? | ||
| PLAN: plan? | ||
| PR: pr? | ||
| sources: | ||
| plan: | ||
| type: [filepath, string] | ||
| optional: true | ||
| instructions: | ||
| type: string | ||
| optional: true |
| for t in source.types: | ||
| if t == "filepath" and os.path.isfile(value): | ||
| registry.bind(key, Uri.parse(f"file://{value}")) | ||
| break |
| return build_state( | ||
| data=data, | ||
| client=client, | ||
| artifact_dir=artifact_dir, | ||
| pipeline_data=pipeline_data, | ||
| child_key=spec.get("child_key") or None, | ||
| parent_stage=str(spec.get("parent_stage") or ""), | ||
| worktree=worktree, | ||
| worktree_parent=worktree_parent, | ||
| repo=str(spec.get("repo") or ""), | ||
| instructions=str(spec.get("instructions") or ""), | ||
| base_ref=str(spec.get("base_ref") or ""), | ||
| ) |
…seeding - plan-document is the resolved markdown artifact; plan remains the raw input source binding (--plan CLI arg), eliminating the DuplicateArtifact collision that prevented recipes from rebinding the resolved content - filepath input sources now copy to session artifacts instead of binding an external file:// URI, so resolve-plan-input's early-exit check fires correctly and the plan-document binding succeeds Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ant cases Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Introduces an input sources mechanism that allows pipelines to declare external artifact sources in their YAML configuration. This enables systematic handling of external inputs (plan files, issue references, instructions) rather than hardcoding registry pre-seeding logic.
Changes
_seed_registry_from_sourcesfunction drives artifact registration based on declared sources, with legacy fallback for pipelines without input_sources.Design
The mechanism separates artifact declaration (pipeline YAML) from artifact loading (launcher), allowing:
Closes #1053