input materialization over-matches references, creating warning storms and irrelevant materialization

## Problem
Input materialization appears to over-classify arbitrary tokens as path/glob references, producing high warning volume and broad materialization of irrelevant files.

In run `01KJR25QS6RY52D7VS2ZRAAXMK`, this behavior is stable across all stages rather than isolated to one node.

## Why this matters
- Warning noise hides actionable diagnostics.
- Over-materialized inputs inflate context and can degrade convergence quality.
- The system spends time on parsing/expanding invalid candidates that are clearly not file references.
- Irrelevant artifacts in materialized inputs increase nondeterminism.

## Evidence
Run artifacts:
- `~/.local/state/kilroy/attractor/runs/01KJR25QS6RY52D7VS2ZRAAXMK/progress.ndjson`
- Stage manifests under `.../01KJR25QS6RY52D7VS2ZRAAXMK/*/inputs_manifest.json`

Observed counts from this run:
- `input_materialization_warning` events: `1397`
- Per-stage manifests consistently show:
  - `warnings=127`
  - `resolved_files≈2308`
  - `discovered_references=3388`

Representative warnings include clearly non-path tokens:
- `expand input glob "you are wielding ([ch]) [weapon": syntax error in pattern`
- `expand input glob "DEFAULT_TOOL_LIMITS[tool_name": syntax error in pattern`

Materialized file sets include irrelevant artifact paths such as:
- `.worktrees/rogue-logs/worktree/.cargo-target/...`

Relevant code to inspect:
- `internal/attractor/engine/input_reference_scan.go`
- `internal/attractor/engine/input_materialization.go`

Likely hotspots:
- `deterministicInputReferenceScanner` token capture breadth
- glob classification for tokens containing `*?[`
- artifact-path filtering (`isLikelyArtifactInputPath`) currently not excluding `.worktrees`

## Steps to reproduce / observe
1. Inspect run `01KJR25QS6RY52D7VS2ZRAAXMK` and count `input_materialization_warning` in `progress.ndjson`.
2. Compare `warnings/resolved_files/discovered_references` across all stage manifests.
3. Sample warning payloads and verify many candidates are non-reference text fragments.
4. Inspect `resolved_files` for `.worktrees/**` and build-cache artifacts.

## Scope boundaries
This issue is about reference scanning/classification/materialization precision.

This issue is not:
- A provider/model routing issue.
- A one-off suppression of warnings without improving classification quality.
- A project-specific workaround.

## Potential directions (non-prescriptive)
- Tighten scanner heuristics for structured/unstructured token extraction.
- Require stronger validity checks before classifying a token as a glob.
- Expand default artifact exclusions to include `.worktrees/**`.
- Add counters separating candidate extraction, rejected candidates, failed expansions, and accepted references.
- Add regression tests for known bad-token patterns and artifact contamination.

## Definition of done
- Reproducing this run shape no longer yields warning storms from non-reference tokens.
- `.worktrees/**` artifacts are not materialized by default.
- Warning volume is materially reduced without dropping legitimate reference discovery.
- Regression tests cover bad-token and artifact-path cases.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

input materialization over-matches references, creating warning storms and irrelevant materialization #48

Problem

Why this matters

Evidence

Steps to reproduce / observe

Scope boundaries

Potential directions (non-prescriptive)

Definition of done

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

input materialization over-matches references, creating warning storms and irrelevant materialization #48

Description

Problem

Why this matters

Evidence

Steps to reproduce / observe

Scope boundaries

Potential directions (non-prescriptive)

Definition of done

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions