diff --git a/.claude/commands/review-data.md b/.claude/commands/review-data.md
new file mode 100644
index 0000000..b5cfd52
--- /dev/null
+++ b/.claude/commands/review-data.md
@@ -0,0 +1,91 @@
+---
+name: review-data
+description: Product/data analyst review of generated report data for analytics readiness and DWH suitability
+---
+
+# Role
+
+You are a senior product data analyst with 10+ years of experience in data warehousing (ClickHouse, Greenplum, BigQuery, Snowflake), analytics engineering (dbt), and building data products from semi-structured sources. You think in terms of fact tables, dimension tables, grain, cardinality, query patterns, and downstream BI consumption.
+
+You are NOT a software engineer. You do not care about Go code or implementation details. You care about the **data** — its shape, quality, completeness, and fitness for analytical workloads.
+
+# Task
+
+Review the data file at: $ARGUMENTS
+
+If no file path is provided, ask the user for one.
+
+# Analysis Framework
+
+## Phase 1: Schema Discovery
+
+Sample the file (first 50KB, last 10KB, and 2-3 random sections from the middle). Map out:
+
+- Top-level structure (array of objects? nested report? envelope?)
+- Every distinct entity type (functions, files, commits, authors, clone pairs, etc.)
+- Nesting depth and where arrays-of-objects live
+- Key fields, identifiers, foreign-key-like references between entities
+- Data types: strings, numerics, booleans, timestamps, enums, free-text
+
+Produce a **data catalog** — a flat table listing every field path, its type, cardinality estimate (low/medium/high/unique), and nullability.
+
+## Phase 2: Grain & Relationship Analysis
+
+For each entity type:
+
+- What is the **grain** (one row = what)?
+- What are the natural keys?
+- What are the relationships (1:1, 1:N, M:N) between entities?
+- Are relationships explicit (foreign keys) or implicit (shared field values)?
+- Is there a time dimension? What's the temporal grain?
+
+Draw an **entity-relationship summary** in text/ASCII.
+
+## Phase 3: Analytical Quality Assessment
+
+Score each dimension (1-5 stars) with justification:
+
+1. **Completeness** — Are there gaps, nulls, missing relationships?
+2. **Consistency** — Same entity named differently in different analyzers? Units mismatched?
+3. **Granularity** — Is the data at a useful grain or pre-aggregated into uselessness?
+4. **Denormalization** — Is it query-friendly or would ETL need to unnest/flatten heavily?
+5. **Cardinality** — Are there high-cardinality string fields that would explode dimension tables?
+6. **Temporal coverage** — Is time-series data present? At what resolution?
+7. **Identifiers** — Are entities consistently identifiable across analyzers?
+
+## Phase 4: DWH Suitability Assessment
+
+For ClickHouse / Greenplum / columnar DWH specifically:
+
+- **Ingestion**: Can this JSON be loaded as-is, or does it need pre-processing? How much ETL?
+- **Table design**: Propose a star/snowflake schema sketch (fact tables + dimensions)
+- **Partitioning strategy**: What would you partition by? (time? file path prefix? analyzer?)
+- **Sort keys / ORDER BY**: What query patterns does this data naturally support?
+- **Materialized views**: What pre-aggregations would be valuable?
+- **Estimated row counts**: From this sample, project table sizes at scale (e.g., for repos with 100K commits, 50K files)
+- **Compression**: Are there fields that compress well (low-cardinality enums) vs poorly (unique strings)?
+
+## Phase 5: Analytics Readiness Verdict
+
+Answer these questions directly:
+
+1. **Can a BI analyst build dashboards from this data without engineering help?** (Yes/No/With caveats)
+2. **What analytics questions can this data answer today?** (List top 10)
+3. **What analytics questions are tantalizingly close to answerable but not quite supported by the data?** (List gaps)
+4. **What's the single biggest structural problem for analytics consumption?**
+5. **If you had to ship a "code health dashboard" product from this data in 2 weeks, what would you cut/change?**
+
+## Phase 6: Recommendations
+
+Provide a prioritized list of changes (P0/P1/P2):
+
+- Schema changes that would make DWH loading trivial
+- Missing fields or identifiers that would unlock key analytics
+- Structural changes for better query performance
+- Data quality issues to fix at the source
+
+# Output Format
+
+Use clear section headers. Be opinionated — this is a review, not a neutral description. Use tables where they help. Quote specific field paths from the actual data. Call out both strengths and problems bluntly.
+
+If the file is too large to read fully, sample strategically and note what you sampled vs. what you extrapolated.
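Reviewer note on the Phase 1 "data catalog" in the command above: a minimal Go sketch of what such a flat catalog could look like when built mechanically from a JSON sample. The names `catalog`, `buildCatalog`, and `rows`, the `$`-rooted path notation, and the sample document are all assumptions made for illustration; none of them is part of Codefang or of the command itself. A path that lists `null` among its observed types is a nullable field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// catalog maps a field path (e.g. "$.analyzers[].id") to the set of JSON
// types observed at that path. Hypothetical helper, not a Codefang type.
type catalog map[string]map[string]struct{}

func (c catalog) add(path, typ string) {
	if c[path] == nil {
		c[path] = make(map[string]struct{})
	}
	c[path][typ] = struct{}{}
}

// walk records the type of v at path and recurses into objects and arrays.
// Array elements share one path suffixed with "[]".
func (c catalog) walk(path string, v any) {
	switch t := v.(type) {
	case map[string]any:
		c.add(path, "object")
		for k, child := range t {
			c.walk(path+"."+k, child)
		}
	case []any:
		c.add(path, "array")
		for _, child := range t {
			c.walk(path+"[]", child)
		}
	case string:
		c.add(path, "string")
	case float64:
		c.add(path, "number")
	case bool:
		c.add(path, "boolean")
	case nil:
		c.add(path, "null")
	}
}

// buildCatalog decodes a JSON document and catalogs every field path in it.
func buildCatalog(data []byte) (catalog, error) {
	var root any
	if err := json.Unmarshal(data, &root); err != nil {
		return nil, err
	}
	c := catalog{}
	c.walk("$", root)
	return c, nil
}

// rows renders the catalog as sorted "path<TAB>type|type" lines.
func (c catalog) rows() []string {
	paths := make([]string, 0, len(c))
	for p := range c {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	out := make([]string, 0, len(paths))
	for _, p := range paths {
		types := make([]string, 0, len(c[p]))
		for t := range c[p] {
			types = append(types, t)
		}
		sort.Strings(types)
		out = append(out, p+"\t"+strings.Join(types, "|"))
	}
	return out
}

func main() {
	// Made-up sample; real reports are far larger and would be sampled.
	sample := []byte(`{"analyzers":[{"id":"static/complexity","score":null},{"id":"history/devs","score":0.8}]}`)
	c, err := buildCatalog(sample)
	if err != nil {
		panic(err)
	}
	for _, row := range c.rows() {
		fmt.Println(row)
	}
}
```

Cardinality estimates would need value counting on top of this; the sketch covers only paths, types, and nullability.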
diff --git a/AGENTS.md b/AGENTS.md
index d6ca8ee..967d72b 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -426,12 +426,17 @@ analyzer.Analyze(ctx, nodes)
 - `pkg/alg/lru` - Generic LRU cache with optional Bloom pre-filter, cost-based eviction, and clone-on-insert
 - `pkg/alg` - Generic algorithms: `Range` (half-open interval), `Chunk` (range partitioning), `ForEachPair` (C(n,2) pairwise iteration), `Iterator[T]` (pull-based sequence with `Next()` + `Close()`, EOF signals end), `CollectN[T](iter, limit)` (drain up to limit items, 0 = unlimited), `TraverseTree[T any](root, children, visit)` (iterative pre-order DFS with explicit stack — generic tree traversal). FRD: specs/frds/FRD-20260310-iterator.md, specs/frds/FRD-20260310-traverse-tree.md
 - `pkg/alg/stats` - Core statistics: `Mean`, `MeanStdDev`, `Percentile`, `Median`, `Clamp[T]`, `Min[T]`, `Max[T]`, `Sum[T]`, `ToPercent`, `PercentMultiplier`, `Distribution[T]` (classify-and-count), `EMA` (exponential moving average), `ExceedsThreshold(observed, predicted, threshold)` (absolute relative divergence check). FRD: specs/frds/FRD-20260310-exceeds-threshold.md
+- `internal/analyzers/common/perfile_retainer.go` - Per-file report retention: `PerFileRetainer` embeddable struct with `SetPerFileMode(bool)`, `Retain(report)`, `PerFileResults() map[string]Report`. Extracts source file path from `TypedCollection.SourceFile` or legacy `_source_file` items, stores shallow clone. Embedded in all 5 static analyzer aggregators (complexity, comments, halstead, cohesion, imports). Zero-value is disabled. FRD: specs/frds/FRD-20260327-perfile-retainer.md
+- `internal/analyzers/analyze/perfile.go` - Per-file orchestration: `PerFileModeEnabled` interface for aggregator type-assertion, `PerFileEnricher` interface for JSON enrichment (avoids import cycles), `StaticService.PerFileResults()` getter, `extractPerFileResults` collects per-file reports from aggregators, `enrichWithPerFileData` injects files into JSON output via `PerFileEnricher`, `MakeRelativePath(filePath, rootPath)` for relative file paths. `StaticService.PerFile` bool enables per-file mode in `initAggregators()` and `AnalyzeFolder()`. FRDs: specs/frds/FRD-20260327-static-perfile-orchestration.md, specs/frds/FRD-20260327-json-perfile-emission.md
 - `pkg/alg/mapx` - Generic map/slice operations: `CloneFunc`, `CloneNested`, `MergeAdditive`, `MergeNestedAdditive` (two-level map additive merge; nil dst = no-op; empty inner maps skipped), `SortedKeys`, `Unique`, `SortAndLimit`, `BuildLookupSet` (slice → `map[T]struct{}` set), `EstimateMapSize[K,V](m, entryBytes)` (map memory estimation — `int64(len(m)) * int64(entryBytes)`). Use stdlib `maps.Clone` for shallow map copies; use stdlib `slices.Clone` for shallow slice copies. FRD: specs/frds/FRD-20260310-estimate-map-size.md
 - `pkg/persist` - Codec-based file persistence: `Codec` interface, `JSONCodec`, `GobCodec`, `SaveState`, `LoadState`, `Persister[T]`
 - `pkg/textutil` - Byte-level text utilities: `IsBinary`, `CountLines`, `BinarySniffLength`, `WriteJSON(w, v, pretty)` (JSON encoding with optional two-space indentation). FRD: specs/frds/FRD-20260310-writejson-helper.md
 
+**Content Analyzers:**
+- `internal/analyzers/composition/` - File composition analyzer: `ContentAnalyzer` implementation that classifies files by type (source, vendor, generated, docs, config, binary, image) using enry. Reports breakdown, percentages, and non-source file issues. Info-only score. Uses `filehistory.Classifier` for classification. FRD: specs/frds/FRD-20260404-static-composition-analyzer.md
+
 **Caching:**
-- `internal/cache` - LRU blob cache (thin wrapper over `pkg/alg/lru`), hash sets, generic blob cache
+- `internal/cache` - LRU blob cache (thin wrapper over `pkg/alg/lru`), hash sets, generic blob cache. Incremental analysis cache: `IncrementalMeta` struct, `Key(rootSHA, branch)` deterministic directory name, `WriteMeta`/`ReadMeta` atomic JSON persistence, `IsStale` root SHA validation, `ErrCacheNotFound`/`ErrCacheCorrupt` sentinel errors. FRD: specs/frds/FRD-20260328-incremental-cache-meta.md
 
 **Shared Utilities:**
 - `pkg/sigutil` - Signal-handling utilities: `SignalCleanupGuard` (SIGINT/SIGTERM + `sync.Once` idempotent cleanup + goroutine listener + deregistration on `Close`)
@@ -449,7 +454,11 @@ analyzer.Analyze(ctx, nodes)
 - `internal/analyzers/common/plotpage/builders.go` - Chart factories: `BuildBarChart`, `BuildLineChart`, `BuildPieChart(co, seriesName, data, radius)`. `BuildPieChart` handles 600x400 dimensions, bottom legend, themed labels. Used by cohesion, complexity, comments, halstead, couples
 - `internal/analyzers/analyze/record_reader.go` - Generic store readers: `ReadRecordsIfPresent[T](reader, kinds, kind)` and `ReadRecordIfPresent[T](reader, kinds, kind)`. Used by all 10 analyzer store_reader.go files
 - `internal/analyzers/analyze/record_writer.go` - Generic store writer: `WriteSliceKind[T](w, kind, records)`. Used by devs, anomaly, quality, sentiment, typos, file_history, couples store_writer.go
-- `internal/analyzers/analyze/typed_collection.go` - `TypedCollection` wrapper for deferred map conversion: `TypedCollection{Items, SourceFile, ToMaps}`, `ItemConverter` func type, `SourceFileKey` const, `MapSlice()` method. Per-file analyzers return `TypedCollection` instead of `[]map[string]any`; conversion deferred to serialization boundary. FRD: specs/frds/FRD-20260311-typed-report-items.md
+- `internal/analyzers/analyze/typed_collection.go` - `TypedCollection` wrapper for deferred map conversion: `TypedCollection{Items, SourceFile, Language, Directory, ToMaps}`, `ItemConverter` func type, `SourceFileKey`/`LanguageKey`/`DirectoryKey` consts, `MapSlice()` method. Per-file analyzers return `TypedCollection` instead of `[]map[string]any`; conversion deferred to serialization boundary. `DetailedDataCollector.buildItems()` calls `stampCollectionMetadata()` to propagate Language and Directory to converted maps. FRD: specs/frds/FRD-20260311-typed-report-items.md
+- `internal/analyzers/analyze/metadata.go` - `AnalysisMetadata` struct (`RepoPath`, `RepoName`, `AnalyzedAt`, `CodefangVersion`), `NewAnalysisMetadata(repoPath)` constructor. Injected into `UnifiedModel.Metadata` after `DecodeCombinedBinaryReports`. FRD: specs/frds/FRD-20260408-output-metadata.md
+- `internal/analyzers/analyze/tick_bounds.go` - `TickBounds{StartTime, EndTime}` type with `FormatStartTime()`/`FormatEndTime()` (RFC 3339), `BuildTickBounds(ticks []TICK) map[int]TickBounds`. Used by all history analyzers to export tick timestamps. FRD: specs/frds/FRD-20260408-tick-timestamps.md
+- `internal/analyzers/analyze/schema_registry.go` - `FieldMeta{Type, Grain, Description}`, `AnalyzerSchema` (map alias), `SchemaForAnalyzer(id) AnalyzerSchema`. Static registry covering all 17 analyzers with type (list/aggregate/time_series/risk/scalar) and grain (function/file/tick/pair/developer). FRD: specs/frds/FRD-20260408-schema-manifest.md
+- `internal/identity/split.go` - `SplitIdentity(s string) (name, email string)`. Handles pipe-delimited (`"alice|alice@example.com"`), exact (`"alice <alice@example.com>"`), and plain name formats. Used by devs and couples analyzers. FRD: specs/frds/FRD-20260408-normalize-developer-identity.md
 - `internal/analyzers/analyze/analyzer.go` - Report helpers: `ReportFunctionList(report, key)` for single-key extraction (handles both `TypedCollection` and `[]map[string]any`), `ReportFunctionListWithFallback(report, primaryKey, fallbackKey)` for two-key fallback extraction. Used by complexity, halstead, cohesion, comments plot.go
 - `internal/analyzers/common/reportutil/reportutil.go` - Type-safe report accessors: `GetAs[T any](report, key) (T, bool)` (generic base, pure type assertion), `GetFloat64`/`GetInt` (safeconv coercion — handles cross-type), `GetString`/`GetStringSlice`/`GetStringIntMap`/`GetFunctions`/`MapString` (delegate to `GetAs`), `FormatInt`/`FormatFloat`/`FormatPercent`/`Pct`. `GetFunctions` handles `mapSlicer` interface (duck-typing for `TypedCollection` without import cycle). FRD: specs/frds/FRD-20260306-reportutil-getas.md
 
diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..5814392
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,393 @@
+# Changelog
+
+All notable changes to the Codefang project are documented in this file.
+The format follows [Keep a Changelog](https://keepachangelog.com/).
+
+---
+
+## [Unreleased] — Repo hygiene & race fix
+
+### Fixed
+
+- **Race in `internal/framework.PipelineSampler`**:
+  `t1Captured` was a plain `bool` concurrently read by the sampler
+  goroutine (`sample`) and written by the caller (`CaptureT1`),
+  causing intermittent `DATA RACE` under `go test -race`. Converted
+  to `sync/atomic.Bool` with `CompareAndSwap` — at most one t1 heap
+  profile is captured regardless of which goroutine observes the
+  trigger first. Removed the unused `t0Captured` field. Full
+  `go test -race ./...` now green.
+
+### Chore
+
+- **Removed `// FRD: specs/frds/FRD-...md` comments from all `.go`
+  files.** `specs/` is gitignored, so those references broke for
+  anyone cloning the repo. Traceability now lives in FRDs and
+  PR descriptions instead of source code.
+
+---
+
+## [Unreleased] — Cross-phase defaults: vendor & generated excluded
+
+**Breaking change.** Default analysis output across both phases
+now **excludes vendor and generated files** — matching the
+convention of every mature multi-language analyser (eslint skips
+`node_modules/`, rubocop skips `vendor/`, pylint skips `.venv/`,
+scalafix skips `target/`, phpcs skips `vendor/`, GitHub Linguist
+excludes vendor/generated from its language breakdown). To restore
+the pre-2026-04 behaviour, pass both flags:
+`--include-vendored --include-generated`.
+
+### Flags (cross-phase)
+
+- `--include-vendored` (bool, default `false`) — re-include paths
+  detected as vendored by enry / Linguist. Covers `vendor/`,
+  `node_modules/`, `third_party/`, `testdata/`, `dist/`,
+  minified bundles, and more. Cross-language by construction.
+- `--include-generated` (bool, default `false`) — re-include
+  auto-generated files. Covers `*.pb.go`, `zz_generated_*.go`,
+  `*_pb2.py`, `*.min.js`, and content-header markers
+  (`DO NOT EDIT`, `Code generated`, `@generated`, …).
+- `--extra-excluded-prefixes` (strings, default `[]`) — additional
+  UNIX path prefixes to exclude, for ecosystems enry doesn't know
+  about (e.g. `.venv/`, `target/`, `.gradle/`).
+
+All three flags apply identically to `-a 'static/*'` and
+`-a 'history/*'` runs: one flag set, one meaning.
+
+### Deprecated
+
+- `--skip-blacklist` — now a no-op (the new default already excludes
+  vendor and generated). Cobra deprecation warning fires when the
+  flag is passed.
+- `--blacklisted-prefixes` — migrate to `--extra-excluded-prefixes`
+  (identical semantics). Cobra deprecation warning fires when the
+  flag is passed.
+
+Both will be removed in the next minor release.
+
+### Architecture
+
+New package `internal/analyzers/plumbing/pathpolicy` exposes a pure
+`Exclude(path, content, opts) bool` function backed by `enry.IsVendor`
+and `pkg/pathfilter`'s content-aware generated-file detection. Both
+phases call the same helper: a single source of truth, with no
+phase-specific drift.
+
+### Measured impact (cross-language fixture, `-a static/complexity`)
+
+| Invocation                                        | Total Functions |
+| ------------------------------------------------- | --------------: |
+| *(defaults)*                                      | 1               |
+| `--include-vendored`                              | 4               |
+| `--include-vendored --include-generated`          | 5               |
+
+---
+
+## [Unreleased] — Cross-phase consistency for `--languages`
+
+**Motivation**: After the history-side push-down, `--languages` meant
+different things depending on `-a 'history/*'` vs `-a 'static/*'`. Static
+analysis silently ignored the flag — every UAST-supported file was parsed
+and fed to every requested static analyzer regardless of the user's
+preference. This release makes the flag cross-phase: one flag, one
+meaning, both phases narrowed.
+
+### Changes
+
+- **`StaticService.LanguageGlobs`** — new field on the static service,
+  populated from `--languages` via the existing
+  `internal/analyzers/plumbing/langpath` single source of truth. An empty
+  value disables the filter (default behavior unchanged).
+- **Path-based walker hooks** — both `StaticService.streamFiles` (UAST
+  walker) and `StaticService.rawFilePhase` visit-check the basename
+  against the glob set via `matchesLanguageGlobs` before sending the
+  path downstream. Filtered files never reach the UAST parser or any
+  analyzer.
+- **Runtime wiring** — `runStaticAnalyzers` and `runStaticPlotAnalyzers`
+  build the globs via a shared `applyStaticLanguageFilter` helper.
+  Unknown language tokens fail fast on static-only runs with the same
+  error shape as the history side.
+- **Executor signatures** — `staticExecutor` and `staticPlotExecutor`
+  gain a `languages []string` parameter; test stubs updated
+  mechanically.
+
+### Non-goals
+
+- No content-aware post-pass on the static side (the UAST parser's
+  own language router is the final authority for matched files; a
+  second pass would duplicate work).
+- No changes to the history side.
+
+---
+
+## [Unreleased] — Performance: `--languages` filter push-down into libgit2
+
+**Motivation**: The `--languages` flag used to be applied *after* libgit2 had
+already produced a full tree diff. Every delta crossed the cgo boundary, was
+materialised in Go, and only then dropped by the analyzer if its detected
+language wasn't in the allow-list. On polyglot repositories with a narrow
+filter, libgit2 was doing 4× the tree-diff work it needed to.
+
+### Changes
+
+- **New package `internal/analyzers/plumbing/langpath`** — pure Go
+  `Globs(langs []string) (globs []string, wantsAll bool, err error)` backed
+  by enry's generated Linguist dataset (`data.ExtensionsByLanguage` +
+  `data.LanguagesByFilename`). Single source of truth; 100 % test coverage.
+- **New C ABI `cf_tree_diff_v2`** in `pkg/gitlib/clib/{codefang_git.h,diff_ops.c}`
+  accepts a pathspec array which it forwards to libgit2's
+  `git_diff_options.pathspec`. The old `cf_tree_diff` is retired in favour of
+  `cf_tree_diff_v2` via `CGOBridge.TreeDiffWithPathspec`.
+- **`TreeDiffRequest.Pathspec` + `BlobPipeline.TreeDiffPathspec` +
+  `CoordinatorConfig.TreeDiffPathspec`** thread the pathspec from the
+  analyzer through the pipeline to every worker call.
+- **`TreeDiffAnalyzer.Pathspec` + `applyLanguageConfig`** resolve aliases via
+  `enry.GetLanguageByAlias` (so `--languages golang` / `js` / `ts` now work,
+  not just canonical Linguist names) and pre-compute the pathspec at
+  `Configure` time.
+- **Fail-fast on unknown languages**: `--languages notalang` now returns
+  `failed to configure TreeDiff: tree-diff pathspec: unknown language: "notalang"`
+  instead of silently producing an empty report.
+
+### Measured impact
+
+On a 500-commit × 200-file × 4-language synthetic fixture with
+`--languages go`:
+
+| Metric                      | Before  | After   | Δ      |
+| --------------------------- | ------: | ------: | -----: |
+| Wall time                   | 0.44 s  | 0.29 s  | −34 %  |
+| Max RSS                     | 74 MB   | 66 MB   | −11 %  |
+| `cgocall` cumulative CPU    | 800 ms  | 510 ms  | −36 %  |
+| Unique functions in profile | 286     | 209     | −27 %  |
+| JSON report                 |    —    |    —    | byte-identical |
+
+Regression guard (no `--languages` filter): wall time 0.51 s → 0.49 s,
+within noise.
+
+### Non-goals (for this changeset)
+
+- No new user flags.
+- The Go-side `shouldIncludeChange` language filter remains as the precise
+  post-pass (pathspec is deliberately over-inclusive for
+  content-disambiguated extensions such as `.h`, `.pl`, `.m`, `.r`).
+
+---
+
+## [Unreleased] — Analytics Readiness & DWH Suitability
+
+**Motivation**: A comprehensive data analyst review of Codefang's JSON output revealed that while the data was analytically rich (17 analyzers, 1M+ function-level rows, time-series, coupling data), it was structurally hostile to analytics tooling and DWH loading. Function records had bare names with no file paths, time-series ticks had no calendar dates, developer identities used pipe-delimited strings, and nested maps blocked efficient columnar ingestion. This release systematically fixes every identified blocker, raising the data quality score from **2.1/5 to 4.6/5**.
+
+### Architecture: Pipeline Stage Refactor
+
+#### `RawFileAnalyzer` and `FormattableAnalyzer` interfaces
+
+Replaced the `FileContentAnalyzer` + `WalksAllFiles` marker interface pattern with a proper pipeline stage architecture.
+
+**Before**: Analyzers that needed raw file access (not UAST) had to implement `StaticAnalyzer` with a no-op `Analyze(*node.Node)`, plus two marker interfaces discovered at runtime via type assertions.
+
+**After**: Two clean interface hierarchies — `StaticAnalyzer` for UAST-based analysis and `RawFileAnalyzer` for raw file analysis — both embed a shared `FormattableAnalyzer` base. `StaticService` holds separate slices. `AnalyzeFolder` uses `pipeline.RunPhases` with explicit `rawFilePhase` and `uastPhase` stages.
+
+**Why it matters for BI**: The pipeline refactor enabled `StampSourceFile` to receive `rootPath` and convert all file paths to relative — a prerequisite for portable DWH data. It also enabled `StampLanguage` to inject detected language into every function record.
+
+**Files changed**:
+- `internal/analyzers/analyze/analyzer.go` — new `FormattableAnalyzer`, `RawFileAnalyzer` interfaces; `StaticAnalyzer` refactored to embed `FormattableAnalyzer`
+- `internal/analyzers/analyze/static.go` — `StaticService` gains `UASTAnalyzers` + `RawFileAnalyzers` slices; `AnalyzeFolder` uses `pipeline.RunPhases`
+- `internal/analyzers/composition/analyzer.go` — implements `RawFileAnalyzer` directly (removed no-op `Analyze`, `NeedsAllFiles`)
+- `internal/analyzers/analyze/registry.go` — `NewRegistry` accepts three slices
+- `cmd/codefang/commands/run.go` — split `defaultStaticAnalyzers` into `defaultUASTAnalyzers` + `defaultRawFileAnalyzers`
+- `internal/analyzers/analyze/perfile.go` — `PerFileEnricher` uses `[]FormattableAnalyzer`
+- `internal/analyzers/common/renderer/json.go` — `EnrichWithPerFileData` uses `[]FormattableAnalyzer`
+
+---
+
+### Static Analyzers: New Fields on Every Function Record
+
+#### `source_file` — File path on every function record
+
+**Motivation**: 152,000+ function records in the JSON output had bare names like `"ForKind"` with no indication of which file they belonged to. This made it impossible to join function metrics to file-level data, build file heatmaps, or drill down from "bad function" to "where in the repo."
+
+**Root cause**: The `_source_file` stamping mechanism existed and worked through aggregation, but `FormatReportBinary` called `ComputeAllMetrics` which parsed `[]map[string]any` items into typed structs. Those structs had no `SourceFile` field, silently dropping the value during struct conversion.
+
+**Fix**: Added `SourceFile string` to all input `FunctionData` and output data structs (`FunctionComplexityData`, `FunctionHalsteadData`, `FunctionCohesionData`, all comment data structs, `HighRiskFunctionData`, `HighEffortFunctionData`, `LowCohesionFunctionData`, `UndocumentedFunctionData`). Populated from `_source_file` map key during `parseFunctionData` → `Compute()`. Updated `StampSourceFile` to accept `rootPath` and convert to relative via `MakeRelativePath`.
+
+**JSON output key**: `"source_file"` (relative path, e.g., `"pkg/kubelet/kubelet.go"`)
+
+**Analyzers affected**: `static/complexity`, `static/halstead`, `static/cohesion`, `static/comments`
+
+#### `language` — Programming language on every function record
+
+**Motivation**: Analysts had to infer language from file extension at query time. The parser already knows the language.
+
+**Fix**: Added `LanguageKey` constant, `StampLanguage()` function, and `Language` field to `TypedCollection` struct. Language is stamped in `analyzeFilesParallel` via `parser.GetLanguage(filePath)` and propagated through `TypedCollection` → `DetailedDataCollector.buildItems()` → `stampCollectionMetadata()` to reach the output structs.
+
+**JSON output key**: `"language"` (e.g., `"go"`, `"bash"`)
+
+**Analyzers affected**: `static/complexity`, `static/halstead`, `static/cohesion`, `static/comments`
+
+#### `directory` — Parent directory on every function record
+
+**Motivation**: Directory-level aggregation (e.g., "which package has worst complexity") requires parsing file paths at query time, which is expensive in columnar DWH.
+
+**Fix**: Added `DirectoryKey` constant and `Directory` field to `TypedCollection`. Stamped as `filepath.Dir(relativePath)` inside `StampSourceFile`. Propagated via `stampCollectionMetadata()` alongside language.
+
+**JSON output key**: `"directory"` (e.g., `"pkg/kubelet"`)
+
+**Analyzers affected**: `static/complexity`, `static/halstead`, `static/cohesion`, `static/comments`
+
+---
+
+### History Analyzers: Tick Timestamps
+
+#### `start_time` / `end_time` on every time-series tick
+
+**Motivation**: All 6 history time-series analyzers emitted `tick: <int>` with no calendar date. Every time-series chart had an unlabeled X-axis. The `TICK` struct already carried `StartTime`/`EndTime` internally but didn't export them.
+
+**Fix**: Created `TickBounds` type and `BuildTickBounds(ticks []TICK)` helper. Each analyzer's `ticksToReport` adds `tick_bounds` to the Report. Each `ParseReportData` reads it. Each time-series output struct gains `StartTime`/`EndTime` string fields (RFC 3339). For quality and devs analyzers, added timestamp tracking to their tick accumulators (`tickAccumulator.startTime/endTime`, `TickDevData.startTime/endTime`) with min/max tracking in `extractTC` and population in `buildTick`.
+
+**JSON output keys**: `"start_time"`, `"end_time"` (RFC 3339, e.g., `"2024-01-15T10:30:00Z"`)
+
+**Analyzers affected**: `history/sentiment`, `history/anomaly`, `history/quality`, `history/devs` (activity + churn), `history/file-history` (composition_ts)
+
+---
+
+### Developer Identity Normalization
+
+#### Split pipe-delimited names into `name` + `email`
+
+**Motivation**: Developer identity used `"daniel smith|dbsmith@google.com"` pipe-delimited strings from `ReversedPeopleDict`. This blocked clean dimension table creation in DWH systems.
+
+**Fix**: Created `SplitIdentity(s string) (name, email string)` in `internal/identity/split.go`. Handles pipe-delimited, exact `"name <email>"`, and plain name formats. Updated `devName()` → `devNameAndEmail()` and `getDevName()` → `getDevNameAndEmail()`.
+
+**Fields added**:
+- `DeveloperData`: `email` field
+- `BusFactorData`: `primary_dev_email`, `secondary_dev_email`
+- `DeveloperCouplingData`: `developer1_email`, `developer2_email`
+
+**Analyzers affected**: `history/devs`, `history/couples`
+
+---
+
+### Output Structure: Flattened Arrays
+
+#### `developers[].languages` — map → array
+
+**Motivation**: `map[string]LineStats` with variable language-name keys cannot be UNNEST'd in columnar DWH without custom ETL.
+
+**Fix**: Changed `DeveloperData.Languages` from `map[string]pkgplumbing.LineStats` to `[]LanguageStatsEntry`. Internal accumulation uses unexported `langMap`, converted to sorted array via `finalizeLanguages()`. Empty language strings replaced with `"Other"`.
+
+**Before**: `{"Go": {"added": 100, "removed": 5, "changed": 3}}`
+**After**: `[{"language": "Go", "added": 100, "removed": 5, "changed": 3}]`
+
+#### `activity[].by_developer` — map → array
+
+**Motivation**: `map[int]int` (dev_id → commit_count) serializes to JSON with string keys, blocking typed ingestion.
+
+**Fix**: Changed to `[]DeveloperCommits` with `{dev_id, commits}` fields. Sorted by dev_id for deterministic output.
+
+**Before**: `{"2": 5, "3": 3}`
+**After**: `[{"dev_id": 2, "commits": 5}, {"dev_id": 3, "commits": 3}]`
+
+#### `file_contributors[].contributors` — map → array
+
+**Motivation**: `map[int]LineStats` blocked DWH UNNEST.
+
+**Fix**: Changed to `[]ContributorEntry` with `{dev_id, added, removed, changed}` fields. Sorted by dev_id.
+
+**Before**: `{"2": {"added": 42, "removed": 5, "changed": 3}}`
+**After**: `[{"dev_id": 2, "added": 42, "removed": 5, "changed": 3}]`
+
+---
+
+### Output Envelope
+
+#### Top-level `metadata` section
+
+**Motivation**: A DWH ingesting reports from multiple repos could not distinguish them: the output carried no repo name, analysis timestamp, or version.
+
+**Fix**: Added `AnalysisMetadata` struct with `repo_path`, `repo_name` (from `filepath.Base`), `analyzed_at` (RFC 3339), `codefang_version` (from build ldflags). Injected after `DecodeCombinedBinaryReports` in the combined render path.
+
+```json
+{
+  "version": "codefang.run.v1",
+  "metadata": {
+    "repo_path": "/home/user/sources/kubernetes",
+    "repo_name": "kubernetes",
+    "analyzed_at": "2026-04-07T23:33:00Z",
+    "codefang_version": "dev"
+  },
+  "analyzers": [...]
+}
+```
+
+#### Per-analyzer `schema` manifest
+
+**Motivation**: DWH consumers need to know field types, grain, and cardinality for automated ETL generation.
+
+**Fix**: Added `FieldMeta` struct with `{type, grain, description}` and static `analyzerSchemas` registry covering all 17 analyzers. Each `AnalyzerResult` in the output includes a `schema` field.
+
+```json
+{
+  "id": "static/complexity",
+  "schema": {
+    "function_complexity": {
+      "type": "list",
+      "grain": "function",
+      "description": "Per-function cyclomatic and cognitive complexity"
+    }
+  },
+  "report": {...}
+}
+```
+
+#### NDJSON output format
+
+**Motivation**: The monolithic JSON (467MB for kubernetes) must be fully parsed to extract any single analyzer. NDJSON enables streaming ingestion into ClickHouse.
+
+**Fix**: Added `FormatNDJSON` case to `WriteConvertedOutput`. One JSON line per analyzer result, with optional metadata line prepended.
+
+```bash
+codefang run --format ndjson /repo > output.ndjson
+```
+
+---
+
+### Clone Analysis
+
+#### `clone_type_distribution` from full population
+
+**Motivation**: Clone pairs are capped at 1,000 in the output, but the distribution metrics (Type-1/2/3 breakdown) were computed from the capped sample, skewing percentages for large codebases with 22M+ total pairs.
+
+**Fix**: Added `typeDistribution cloneTypeCounts` to `clonePairResult`. `matchCandidates` increments per-type counters for ALL valid pairs before the cap check. Both aggregator and per-file paths emit `clone_type_distribution` in the report. `ReportSection.Distribution()` reads from the full-population distribution.
+
+**Before**: Distribution from 1,000 capped pairs
+**After**: Distribution from 22,381,694 total pairs: `{"Type-1": 12366266, "Type-2": 3307147, "Type-3": 6708281}`
+
+#### Relative paths in clone pairs
+
+Clone pair `func_a` / `func_b` paths changed from absolute (`/home/user/sources/repo/file.go::funcName`) to relative (`cmd/controller/app.go::newController`). Enabled by the `StampSourceFile` rootPath change.
+
+---
+
+### New Files Created
+
+| File | Purpose |
+|------|---------|
+| `internal/analyzers/analyze/tick_bounds.go` | `TickBounds` type + `BuildTickBounds` helper |
+| `internal/analyzers/analyze/metadata.go` | `AnalysisMetadata` struct + `NewAnalysisMetadata` constructor |
+| `internal/analyzers/analyze/schema_registry.go` | Static schema registry for all 17 analyzers |
+| `internal/identity/split.go` | `SplitIdentity(s string) (name, email string)` |
+
+---
+
+### Empty Analyzer Root Causes (Documented)
+
+Investigation of 4 analyzers that returned empty data on kubernetes (1000 commits):
+
+| Analyzer | Root Cause | Resolution |
+|----------|-----------|------------|
+| `burndown.developer_survival` | Disabled by default (`Burndown.TrackPeople: false`) | Enable via config |
+| `burndown.file_survival` | Disabled by default (`Burndown.TrackFiles: false`) | Enable via config |
+| `history/imports` | Requires UAST-enabled pipeline mode (`NeedsUAST() = true`) | Architectural dependency |
+| `history/typos` | Requires UAST-enabled pipeline mode (`NeedsUAST() = true`) | Architectural dependency |
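The `clone_type_distribution` fix in the document above comes down to counting every valid pair before the output cap is applied. Below is a minimal Go sketch of that idea. The `clonePair` type and `collectPairs` function are hypothetical stand-ins, not the project's `clonePairResult` or `matchCandidates`.

```go
package main

import "fmt"

// clonePair is a hypothetical stand-in for one detected clone pair.
type clonePair struct {
	FuncA, FuncB string
	Type         string // "Type-1", "Type-2", or "Type-3"
}

// collectPairs keeps at most maxPairs pairs for the report, but counts
// every pair per clone type before the cap check, so the distribution
// reflects the full population rather than the capped sample.
// maxPairs is assumed to be non-negative.
func collectPairs(all []clonePair, maxPairs int) ([]clonePair, map[string]int) {
	kept := make([]clonePair, 0, maxPairs)
	dist := make(map[string]int)

	for _, p := range all {
		dist[p.Type]++ // counted regardless of the cap

		if len(kept) < maxPairs {
			kept = append(kept, p)
		}
	}

	return kept, dist
}

func main() {
	all := []clonePair{
		{FuncA: "a.go::f", FuncB: "b.go::f", Type: "Type-1"},
		{FuncA: "a.go::g", FuncB: "c.go::g", Type: "Type-1"},
		{FuncA: "a.go::h", FuncB: "d.go::h", Type: "Type-2"},
		{FuncA: "b.go::k", FuncB: "d.go::k", Type: "Type-3"},
		{FuncA: "c.go::m", FuncB: "d.go::m", Type: "Type-3"},
	}

	kept, dist := collectPairs(all, 2)
	fmt.Println(len(kept), dist) // 2 map[Type-1:2 Type-2:1 Type-3:2]
}
```

With a cap of 2, only two pairs are emitted, while the distribution still covers all five.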
diff --git a/Makefile b/Makefile
index b23122e..78bb1b8 100644
--- a/Makefile
+++ b/Makefile
@@ -37,8 +37,9 @@ help:
 	@echo "  build            - Build all binaries (alias for all)"
 	@echo "  libgit2          - Build vendored libgit2 statically (auto-built by 'all')"
 	@echo "  install          - Install binaries to system PATH"
-	@echo "  test             - Run all tests"
-	@echo "  lint             - Run linters and deadcode analysis"
+	@echo "  test             - Run all tests (unit)"
+	@echo "  test-e2e         - Run e2e acceptance tests (RUN=<regex> to filter)"
+	@echo "  lint             - Run linters, deadcode, and orphan package detection"
 	@echo "  fmt              - Format code"
 	@echo "  schemas          - Generate JSON schemas for all analyzers"
 	@echo "  deadcode         - Run deadcode analysis with detailed output"
@@ -111,6 +112,17 @@ testv: all
 	CGO_LDFLAGS="-L$(CURDIR)/$(LIBGIT2_INSTALL)/lib64 -L$(CURDIR)/$(LIBGIT2_INSTALL)/lib -lgit2 -lpthread" \
 	CGO_ENABLED=1 go test ./... -v
 
+# Run end-to-end acceptance tests (tests/e2e/).
+# Add new spec tests by dropping *_test.go files into tests/e2e/.
+# Optional: RUN=<regex> to filter, e.g. make test-e2e RUN=TestPerFile
+RUN ?= .
+.PHONY: test-e2e
+test-e2e: libgit2
+	PKG_CONFIG_PATH=$(LIBGIT2_PKG_CONFIG) \
+	CGO_CFLAGS="-I$(CURDIR)/$(LIBGIT2_INSTALL)/include" \
+	CGO_LDFLAGS="-L$(CURDIR)/$(LIBGIT2_INSTALL)/lib64 -L$(CURDIR)/$(LIBGIT2_INSTALL)/lib -lgit2 -lpthread" \
+	CGO_ENABLED=1 go test -tags e2e -count=1 -v -run $(RUN) ./tests/e2e/...
+
 # Run UAST performance benchmarks (comprehensive suite with organized results)
 bench: all
 	python3 tools/benchmark/benchmark_runner.py
@@ -302,13 +314,15 @@ lint:
 	CGO_ENABLED=1 $(GOLINT) run $(INTERNAL_PKGS)
 	@echo "Running deadcode analysis (production)..."
 	@GOCACHE=$(LINT_GOCACHE) ./scripts/deadcode-filter.sh $(DEADCODE_PKGS)
+	@echo "Running orphan package detection..."
+	@./scripts/orphan-packages.sh $(INTERNAL_PKGS)
 	@echo "✓ Linting complete"
 
 ## deadcode: Run deadcode analysis with whitelist filter (fails if dead code found)
 .PHONY: deadcode
 deadcode:
 	@echo "Running deadcode analysis with whitelist..."
-	@GOCACHE=$(LINT_GOCACHE) ./scripts/deadcode-filter.sh -test $(DEADCODE_PKGS)
+	@GOCACHE=$(LINT_GOCACHE) ./scripts/deadcode-filter.sh $(DEADCODE_PKGS)
 
 ## deadcode-prod: Run deadcode analysis excluding tests (production-only dead code)
 .PHONY: deadcode-prod
diff --git a/cmd/codefang/commands/render.go b/cmd/codefang/commands/render.go
index 29b937e..b79ffee 100644
--- a/cmd/codefang/commands/render.go
+++ b/cmd/codefang/commands/render.go
@@ -3,8 +3,10 @@ package commands
 import (
 	"errors"
 	"fmt"
+	"io"
 	"log/slog"
 	"os"
+	"path/filepath"
 	"strings"
 
 	"github.com/spf13/cobra"
@@ -26,6 +28,8 @@ import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/sentiment"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/shotness"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/typos"
+	"github.com/Sumatoshi-tech/codefang/internal/storage"
+	"github.com/Sumatoshi-tech/codefang/pkg/textutil"
 )
 
 const (
@@ -135,7 +139,33 @@ func runRender(storeDir, outputDir string) error {
 		return fmt.Errorf("render index: %w", indexErr)
 	}
 
-	return nil
+	return writeRenderReportJSON(outputDir, analyzerIDs, pages)
+}
+
+// renderReportJSONFilename is the name of the machine-readable JSON report.
+const renderReportJSONFilename = "report.json"
+
+// renderReportJSONPerm is the file permission for report.json.
+const renderReportJSONPerm = 0o640
+
+// renderReportData is the JSON structure emitted by codefang render.
+type renderReportData struct {
+	AnalyzerIDs []string            `json:"analyzer_ids"`
+	Pages       []plotpage.PageMeta `json:"pages"`
+}
+
+// writeRenderReportJSON emits report.json alongside rendered HTML pages.
+func writeRenderReportJSON(outputDir string, analyzerIDs []string, pages []plotpage.PageMeta) error {
+	reportPath := filepath.Join(outputDir, renderReportJSONFilename)
+
+	data := renderReportData{
+		AnalyzerIDs: analyzerIDs,
+		Pages:       pages,
+	}
+
+	return storage.WriteAtomic(reportPath, renderReportJSONPerm, func(w io.Writer) error {
+		return textutil.WriteJSON(w, data, true)
+	})
 }
 
 func renderOneAnalyzer(
diff --git a/cmd/codefang/commands/render_test.go b/cmd/codefang/commands/render_test.go
index 418d482..0240477 100644
--- a/cmd/codefang/commands/render_test.go
+++ b/cmd/codefang/commands/render_test.go
@@ -1,7 +1,5 @@
 package commands
 
-// FRD: specs/frds/FRD-20260228-render-command.md.
-
 import (
 	"os"
 	"path/filepath"
diff --git a/cmd/codefang/commands/run.go b/cmd/codefang/commands/run.go
index 18c3875..a5d5ccc 100644
--- a/cmd/codefang/commands/run.go
+++ b/cmd/codefang/commands/run.go
@@ -32,12 +32,15 @@ import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/plotpage"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/complexity"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/composition"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/couples"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/devs"
 	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/halstead"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/imports"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/langpath"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/quality"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/sentiment"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/shotness"
@@ -60,8 +63,11 @@ type staticExecutor func(
 	format string,
 	verbose bool,
 	noColor bool,
+	perFile bool,
 	maxWorkers int,
 	memoryBudget int64,
+	languages []string,
+	pathPolicy pathpolicy.Options,
 	writer io.Writer,
 ) error
 
@@ -70,6 +76,8 @@ type staticPlotExecutor func(
 	analyzerIDs []string,
 	maxWorkers int,
 	memoryBudget int64,
+	languages []string,
+	pathPolicy pathpolicy.Options,
 	outputDir string,
 ) error
 
@@ -95,19 +103,23 @@ type HistoryRunOptions struct {
 	Head        bool
 	Since       string
 
-	Workers         int
-	BufferSize      int
-	CommitBatchSize int
-	BlobCacheSize   string
-	DiffCacheSize   int
-	BlobArenaSize   string
-	MemoryBudget    string
+	Workers             int
+	BufferSize          int
+	CommitBatchSize     int
+	BlobCacheSize       string
+	DiffCacheSize       int
+	BlobArenaSize       string
+	MemoryBudget        string
+	MaxChangesPerCommit int
 
 	Checkpoint      *bool
 	CheckpointDir   string
 	Resume          *bool
 	ClearCheckpoint bool
 
+	CacheDir string
+	NoCache  bool
+
 	DebugTrace bool
 	NDJSON     bool
 
@@ -157,16 +169,19 @@ type RunCommand struct {
 	head        bool
 	since       string
 
-	workers         int
-	bufferSize      int
-	commitBatchSize int
-	blobCacheSize   string
-	diffCacheSize   int
-	blobArenaSize   string
-	memoryBudget    string
+	workers             int
+	bufferSize          int
+	commitBatchSize     int
+	blobCacheSize       string
+	diffCacheSize       int
+	blobArenaSize       string
+	memoryBudget        string
+	maxChangesPerCommit int
 
 	checkpointDir   string
 	clearCheckpoint bool
+	cacheDir        string
+	noCache         bool
 
 	ndjson bool
 
@@ -175,6 +190,12 @@ type RunCommand struct {
 	diagnosticsAddr string
 
 	staticWorkers int
+	perFile       bool
+
+	// Cross-phase path exclusion policy.
+	includeVendored       bool
+	includeGenerated      bool
+	extraExcludedPrefixes []string
 
 	plotOutput string
 	keepStore  bool
@@ -269,17 +290,23 @@ func newRunCommandWithAllDeps(
 
 	cmd.Flags().IntVar(&rc.workers, "workers", 0, "Number of parallel workers (0 = use CPU count)")
 	cmd.Flags().IntVar(&rc.staticWorkers, "static-workers", 0, "Number of parallel static analysis workers (0 = min(CPU count, 8))")
+	rc.registerExclusionFlags(cmd)
+
+	cmd.Flags().BoolVarP(&rc.perFile, "per-file", "F", false,
+		"Include per-file breakdowns and summary statistics in static output")
 	cmd.Flags().IntVar(&rc.bufferSize, "buffer-size", 0, "Size of internal pipeline channels (0 = workers*2)")
 	cmd.Flags().IntVar(&rc.commitBatchSize, "commit-batch-size", 0, "Commits per processing batch (0 = default 100)")
 	cmd.Flags().StringVar(&rc.blobCacheSize, "blob-cache-size", "", "Max blob cache size (e.g., '256MB', '1GB'; empty = default 1GB)")
 	cmd.Flags().IntVar(&rc.diffCacheSize, "diff-cache-size", 0, "Max diff cache entries (0 = default 10000)")
 	cmd.Flags().StringVar(&rc.blobArenaSize, "blob-arena-size", "", "Memory arena size for blob loading (e.g., '4MB'; empty = default 4MB)")
 	cmd.Flags().StringVar(&rc.memoryBudget, "memory-budget", "", "Memory budget for auto-tuning (e.g., '512MB', '2GB')")
+	cmd.Flags().IntVar(&rc.maxChangesPerCommit, "max-changes-per-commit", 0,
+		"Skip commits whose tree diff exceeds this many changes (0 = default 10000). "+
+			"Commits over the cap are silently dropped from history, which can desync "+
+			"burndown's tracked state for affected files. Raise on monorepos with "+
+			"legitimate large commits (Pods updates, generated code dumps).")
 
-	cmd.Flags().Bool("checkpoint", true, "Enable checkpointing for crash recovery")
-	cmd.Flags().StringVar(&rc.checkpointDir, "checkpoint-dir", "", "Checkpoint directory (default: ~/.codefang/checkpoints)")
-	cmd.Flags().Bool("resume", true, "Resume from checkpoint if available")
-	cmd.Flags().BoolVar(&rc.clearCheckpoint, "clear-checkpoint", false, "Clear existing checkpoint before run")
+	rc.registerPersistenceFlags(cmd)
 
 	cmd.Flags().StringVar(&rc.configFile, "config", "", "Configuration file path (default: .codefang.yaml in CWD or $HOME)")
 	cmd.Flags().BoolVar(&rc.listAnalyzers, "list-analyzers", false, "List all available analyzer IDs and exit")
@@ -524,7 +551,7 @@ func (rc *RunCommand) runDirect(
 		return rc.renderCombinedDirect(ctx, path, staticIDs, historyIDs, staticFormat, silent, progressWriter, writer, cmd)
 	}
 
-	err = rc.runStaticPhase(path, staticIDs, staticFormat, silent, progressWriter, writer)
+	err = rc.runStaticPhase(path, staticIDs, staticFormat, silent, progressWriter, writer, cmd)
 	if err != nil {
 		return err
 	}
@@ -548,6 +575,7 @@ func (rc *RunCommand) runStaticPhase(
 	silent bool,
 	progressWriter io.Writer,
 	writer io.Writer,
+	cmd *cobra.Command,
 ) error {
 	if len(staticIDs) == 0 {
 		return nil
@@ -567,12 +595,21 @@ func (rc *RunCommand) runStaticPhase(
 
 	rc.progressf(silent, progressWriter, "static phase started (%d analyzers)", len(staticIDs))
 
+	languages := readLanguagesFlag(cmd)
+	policy := rc.buildPathPolicy()
+
 	var err error
 
 	if staticFormat == analyze.FormatPlot {
-		err = rc.staticPlotExec(path, staticIDs, rc.staticWorkers, budgetBytes, rc.plotOutput)
+		err = rc.staticPlotExec(
+			path, staticIDs, rc.staticWorkers, budgetBytes, languages, policy, rc.plotOutput,
+		)
 	} else {
-		err = rc.staticExec(path, staticIDs, staticFormat, rc.verbose, rc.noColor, rc.staticWorkers, budgetBytes, writer)
+		err = rc.staticExec(
+			path, staticIDs, staticFormat,
+			rc.verbose, rc.noColor, rc.perFile,
+			rc.staticWorkers, budgetBytes, languages, policy, writer,
+		)
 	}
 
 	if err != nil {
@@ -619,6 +656,25 @@ func (rc *RunCommand) runHistoryPhase(
 	return nil
 }
 
+// combinedIDsAndModes builds parallel slices of analyzer IDs and their modes
+// (static first, history second) for DecodeCombinedBinaryReports.
+func combinedIDsAndModes(staticIDs, historyIDs []string) ([]string, []analyze.AnalyzerMode) {
+	ids := make([]string, 0, len(staticIDs)+len(historyIDs))
+	modes := make([]analyze.AnalyzerMode, 0, len(staticIDs)+len(historyIDs))
+
+	for _, id := range staticIDs {
+		ids = append(ids, id)
+		modes = append(modes, analyze.ModeStatic)
+	}
+
+	for _, id := range historyIDs {
+		ids = append(ids, id)
+		modes = append(modes, analyze.ModeHistory)
+	}
+
+	return ids, modes
+}
+
 func (rc *RunCommand) renderCombinedDirect(
 	ctx context.Context,
 	path string,
@@ -640,7 +696,8 @@ func (rc *RunCommand) renderCombinedDirect(
 
 	err := rc.staticExec(
 		path, staticIDs, analyze.FormatBinary,
-		rc.verbose, rc.noColor, rc.staticWorkers, budgetBytes, &raw,
+		rc.verbose, rc.noColor, rc.perFile, rc.staticWorkers, budgetBytes,
+		readLanguagesFlag(cmd), rc.buildPathPolicy(), &raw,
 	)
 	if err != nil {
 		return fmt.Errorf("render combined static phase: %w", err)
@@ -661,24 +718,15 @@ func (rc *RunCommand) renderCombinedDirect(
 
 	rc.progressf(silent, progressWriter, "combined history phase finished in %s", time.Since(startedAt).Round(time.Millisecond))
 
-	ids := make([]string, 0, len(staticIDs)+len(historyIDs))
-	modes := make([]analyze.AnalyzerMode, 0, len(staticIDs)+len(historyIDs))
-
-	for _, id := range staticIDs {
-		ids = append(ids, id)
-		modes = append(modes, analyze.ModeStatic)
-	}
-
-	for _, id := range historyIDs {
-		ids = append(ids, id)
-		modes = append(modes, analyze.ModeHistory)
-	}
+	ids, modes := combinedIDsAndModes(staticIDs, historyIDs)
 
 	model, err := analyze.DecodeCombinedBinaryReports(raw.Bytes(), ids, modes)
 	if err != nil {
 		return fmt.Errorf("decode combined payload: %w", err)
 	}
 
+	model.Metadata = analyze.NewAnalysisMetadata(path)
+
 	rc.progressf(silent, progressWriter, "combined payload decoded")
 
 	startedAt = time.Now()
@@ -703,38 +751,65 @@ func (rc *RunCommand) renderCombinedDirect(
 
 func (rc *RunCommand) buildHistoryRunOptions(cmd *cobra.Command) HistoryRunOptions {
 	opts := HistoryRunOptions{
-		GCPercent:       rc.gogc,
-		BallastSize:     rc.ballastSize,
-		CPUProfile:      rc.cpuprofile,
-		HeapProfile:     rc.heapprofile,
-		Limit:           rc.limit,
-		FirstParent:     rc.firstParent,
-		Head:            rc.head,
-		Since:           rc.since,
-		Workers:         rc.workers,
-		BufferSize:      rc.bufferSize,
-		CommitBatchSize: rc.commitBatchSize,
-		BlobCacheSize:   rc.blobCacheSize,
-		DiffCacheSize:   rc.diffCacheSize,
-		BlobArenaSize:   rc.blobArenaSize,
-		MemoryBudget:    rc.memoryBudget,
-		CheckpointDir:   rc.checkpointDir,
-		ClearCheckpoint: rc.clearCheckpoint,
-		DebugTrace:      rc.debugTrace,
-		NDJSON:          rc.ndjson,
-		ConfigFile:      rc.configFile,
-		PlotOutput:      rc.plotOutput,
-		KeepStore:       rc.keepStore,
-		TmpDir:          rc.tmpDir,
+		GCPercent:           rc.gogc,
+		BallastSize:         rc.ballastSize,
+		CPUProfile:          rc.cpuprofile,
+		HeapProfile:         rc.heapprofile,
+		Limit:               rc.limit,
+		FirstParent:         rc.firstParent,
+		Head:                rc.head,
+		Since:               rc.since,
+		Workers:             rc.workers,
+		BufferSize:          rc.bufferSize,
+		CommitBatchSize:     rc.commitBatchSize,
+		BlobCacheSize:       rc.blobCacheSize,
+		DiffCacheSize:       rc.diffCacheSize,
+		BlobArenaSize:       rc.blobArenaSize,
+		MemoryBudget:        rc.memoryBudget,
+		MaxChangesPerCommit: rc.maxChangesPerCommit,
+		CheckpointDir:       rc.checkpointDir,
+		ClearCheckpoint:     rc.clearCheckpoint,
+		CacheDir:            rc.cacheDir,
+		NoCache:             rc.noCache,
+		DebugTrace:          rc.debugTrace,
+		NDJSON:              rc.ndjson,
+		ConfigFile:          rc.configFile,
+		PlotOutput:          rc.plotOutput,
+		KeepStore:           rc.keepStore,
+		TmpDir:              rc.tmpDir,
 	}
 
 	opts.Checkpoint = parseBoolFlag(cmd, "checkpoint")
 	opts.Resume = parseBoolFlag(cmd, "resume")
 	opts.AnalyzerFlags = collectAnalyzerFlags(cmd)
+	opts.AnalyzerFlags[plumbing.ConfigTreeDiffPathPolicy] = rc.buildPathPolicy()
 
 	return opts
 }
 
+// registerPersistenceFlags registers checkpoint and incremental cache flags.
+func (rc *RunCommand) registerPersistenceFlags(cmd *cobra.Command) {
+	cmd.Flags().Bool("checkpoint", true, "Enable checkpointing for crash recovery")
+	cmd.Flags().StringVar(&rc.checkpointDir, "checkpoint-dir", "",
+		"Checkpoint directory (default: ~/.codefang/checkpoints)")
+	cmd.Flags().Bool("resume", true, "Resume from checkpoint if available")
+	cmd.Flags().BoolVar(&rc.clearCheckpoint, "clear-checkpoint", false,
+		"Clear existing checkpoint before run")
+	cmd.Flags().StringVar(&rc.cacheDir, "cache-dir", "",
+		"Incremental analysis cache directory (skip already-processed commits)")
+	cmd.Flags().BoolVar(&rc.noCache, "no-cache", false,
+		"Force full re-analysis, ignoring any existing cache")
+}
+
+// resolveCacheDir returns the cache directory from opts, or "" when --no-cache is set or no directory is configured.
+func resolveCacheDir(opts HistoryRunOptions) string {
+	if opts.NoCache || opts.CacheDir == "" {
+		return ""
+	}
+
+	return opts.CacheDir
+}
+
 // parseBoolFlag returns a pointer to the flag value if it was explicitly set, nil otherwise.
 func parseBoolFlag(cmd *cobra.Command, name string) *bool {
 	if !cmd.Flags().Changed(name) {
@@ -841,7 +916,7 @@ func (rc *RunCommand) printAnalyzerList(writer io.Writer, registry *analyze.Regi
 }
 
 func defaultRegistry() (*analyze.Registry, error) {
-	return analyze.NewRegistry(defaultStaticAnalyzers(), defaultHistoryLeaves())
+	return analyze.NewRegistry(defaultUASTAnalyzers(), defaultRawFileAnalyzers(), defaultHistoryLeaves())
 }
 
 func runStaticAnalyzers(
@@ -850,13 +925,23 @@ func runStaticAnalyzers(
 	format string,
 	verbose bool,
 	noColor bool,
+	perFile bool,
 	maxWorkers int,
 	memoryBudget int64,
+	languages []string,
+	pathPolicy pathpolicy.Options,
 	writer io.Writer,
 ) error {
-	service := analyze.NewStaticService(defaultStaticAnalyzers())
+	service := analyze.NewStaticService(defaultUASTAnalyzers(), defaultRawFileAnalyzers())
 	service.Renderer = renderer.NewDefaultStaticRenderer()
 	service.MaxWorkers = maxWorkers
+	service.PerFile = perFile
+	service.PathPolicy = pathPolicy
+
+	err := applyStaticLanguageFilter(service, languages)
+	if err != nil {
+		return err
+	}
 
 	applyStaticBudgetConfig(service, maxWorkers, memoryBudget)
 	applyStaticProgressLogging(service, verbose)
@@ -865,17 +950,24 @@ func runStaticAnalyzers(
 }
 
 // runStaticPlotAnalyzers runs static analysis and renders multi-page HTML plot output.
-// FRD: specs/frds/FRD-20260312-static-plot-multipage.md.
 func runStaticPlotAnalyzers(
 	path string,
 	analyzerIDs []string,
 	maxWorkers int,
 	memoryBudget int64,
+	languages []string,
+	pathPolicy pathpolicy.Options,
 	outputDir string,
 ) error {
-	service := analyze.NewStaticService(defaultStaticAnalyzers())
+	service := analyze.NewStaticService(defaultUASTAnalyzers(), defaultRawFileAnalyzers())
 	service.MaxWorkers = maxWorkers
 	service.AggregationMode = analyze.AggregationModeFull
+	service.PathPolicy = pathPolicy
+
+	err := applyStaticLanguageFilter(service, languages)
+	if err != nil {
+		return err
+	}
 
 	applyStaticBudgetConfig(service, maxWorkers, memoryBudget)
 	applyStaticProgressLogging(service, false)
@@ -895,7 +987,6 @@ func runStaticPlotAnalyzers(
 
 // applyStaticProgressLogging wires progress logging into the static service.
 // Default mode logs phase and file count. Verbose mode adds RSS and aggregator sizes.
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
 func applyStaticProgressLogging(service *analyze.StaticService, verbose bool) {
 	if verbose {
 		service.ProgressFunc = func(e analyze.StaticProgressEvent) {
@@ -914,9 +1005,73 @@ func applyStaticProgressLogging(service *analyze.StaticService, verbose bool) {
 	}
 }
 
+// registerExclusionFlags registers the three cross-phase path exclusion
+// flags.
+func (rc *RunCommand) registerExclusionFlags(cmd *cobra.Command) {
+	cmd.Flags().BoolVar(&rc.includeVendored, "include-vendored", false,
+		"Re-include vendored dependencies (detected by enry / Linguist) in analysis. "+
+			"Default: exclude vendor/, node_modules/, third_party/, testdata/, minified bundles, etc.")
+	cmd.Flags().BoolVar(&rc.includeGenerated, "include-generated", false,
+		"Re-include auto-generated files in analysis. "+
+			"Default: exclude *.pb.go, zz_generated_*.go, *_pb2.py, *.min.js, and any file whose "+
+			"first 512 bytes contain a generated-file marker (\"DO NOT EDIT\", \"Code generated\", etc.).")
+	cmd.Flags().StringSliceVar(&rc.extraExcludedPrefixes, "extra-excluded-prefixes", nil,
+		"Additional UNIX path prefixes to exclude on top of enry heuristics (e.g. "+
+			"\".venv/,target/,build/\"). Applies to both static and history phases.")
+}
+
+// buildPathPolicy constructs the cross-phase path exclusion policy from
+// the --include-vendored, --include-generated, and --extra-excluded-prefixes
+// flags.
+func (rc *RunCommand) buildPathPolicy() pathpolicy.Options {
+	return pathpolicy.Options{
+		IncludeVendored:       rc.includeVendored,
+		IncludeGenerated:      rc.includeGenerated,
+		ExtraExcludedPrefixes: rc.extraExcludedPrefixes,
+	}
+}
+
+// readLanguagesFlag extracts the --languages slice from the cobra command
+// when present. It returns nil for a nil command or when the flag is not
+// registered, so tests that construct a RunCommand without wiring every
+// cobra flag do not fail.
+func readLanguagesFlag(cmd *cobra.Command) []string {
+	if cmd == nil {
+		return nil
+	}
+
+	languages, err := cmd.Flags().GetStringSlice("languages")
+	if err != nil {
+		return nil
+	}
+
+	return languages
+}
+
+// applyStaticLanguageFilter derives libgit2-style basename globs from the
+// user's --languages value and assigns them to the static service. Empty
+// or "all" disables the filter (default behavior). An unknown language
+// token surfaces as an error so static-only runs fail fast — matching the
+// history-side semantics.
+func applyStaticLanguageFilter(service *analyze.StaticService, languages []string) error {
+	globs, wantsAll, err := langpath.Globs(languages)
+	if err != nil {
+		return fmt.Errorf("static --languages: %w", err)
+	}
+
+	if wantsAll {
+		service.LanguageGlobs = nil
+
+		return nil
+	}
+
+	service.LanguageGlobs = globs
+
+	return nil
+}
+
 // applyStaticBudgetConfig applies budget-derived parameters to the static service.
 // Explicit --static-workers overrides budget-derived MaxWorkers.
-// FRD: specs/frds/FRD-20260312-static-budget-tuning.md.
 func applyStaticBudgetConfig(service *analyze.StaticService, explicitWorkers int, memoryBudget int64) {
 	cfg := budget.SolveStaticBudget(memoryBudget)
 	if cfg.MaxWorkers == 0 {
@@ -1184,15 +1339,16 @@ func configureAndSelect(
 
 func buildConfigParams(opts HistoryRunOptions, fileCfg *cfgpkg.Config) framework.ConfigParams {
 	params := framework.ConfigParams{
-		Workers:         opts.Workers,
-		BufferSize:      opts.BufferSize,
-		CommitBatchSize: opts.CommitBatchSize,
-		BlobCacheSize:   opts.BlobCacheSize,
-		DiffCacheSize:   opts.DiffCacheSize,
-		BlobArenaSize:   opts.BlobArenaSize,
-		MemoryBudget:    opts.MemoryBudget,
-		GCPercent:       opts.GCPercent,
-		BallastSize:     opts.BallastSize,
+		Workers:             opts.Workers,
+		BufferSize:          opts.BufferSize,
+		CommitBatchSize:     opts.CommitBatchSize,
+		BlobCacheSize:       opts.BlobCacheSize,
+		DiffCacheSize:       opts.DiffCacheSize,
+		BlobArenaSize:       opts.BlobArenaSize,
+		MemoryBudget:        opts.MemoryBudget,
+		GCPercent:           opts.GCPercent,
+		BallastSize:         opts.BallastSize,
+		MaxChangesPerCommit: opts.MaxChangesPerCommit,
 	}
 
 	if fileCfg != nil {
@@ -1257,6 +1413,7 @@ func executeHistoryPipeline(
 	}
 
 	coordConfig.FirstParent = opts.FirstParent
+	coordConfig.TreeDiffPathspec = extractTreeDiffPathspec(pl.Core)
 
 	if !needsUAST(selectedLeaves) {
 		coordConfig.UASTPipelineWorkers = 0
@@ -1264,6 +1421,7 @@ func executeHistoryPipeline(
 
 	runner := framework.NewRunnerWithConfig(repository, path, coordConfig, allAnalyzers...)
 	runner.CoreCount = len(pl.Core)
+	runner.CacheDir = resolveCacheDir(opts)
 
 	red, analysisMetrics, metricsErr := createRunMetrics()
 	if metricsErr != nil {
@@ -1610,6 +1768,34 @@ func registerAnalyzerFlags(cobraCmd *cobra.Command) {
 			registerConfigFlag(cobraCmd, opt)
 		}
 	}
+
+	markDeprecatedExclusionFlags(cobraCmd)
+}
+
+// markDeprecatedExclusionFlags marks the legacy exclusion flags as
+// deprecated, directing users to the new cross-phase flags. MarkDeprecated
+// fails only when the flag is not registered or the message is empty, both
+// programmer mistakes, so errors are logged via the standard library logger
+// to surface them during development instead of being silently swallowed.
+func markDeprecatedExclusionFlags(cobraCmd *cobra.Command) {
+	const (
+		skipBlacklistFlag  = "skip-blacklist"
+		blacklistedPfxFlag = "blacklisted-prefixes"
+	)
+
+	err := cobraCmd.Flags().MarkDeprecated(skipBlacklistFlag,
+		"use --include-vendored=false and --include-generated=false "+
+			"(the new defaults). See CHANGELOG for migration.")
+	if err != nil {
+		log.Printf("warn: mark %q deprecated: %v", skipBlacklistFlag, err)
+	}
+
+	err = cobraCmd.Flags().MarkDeprecated(blacklistedPfxFlag,
+		"use --extra-excluded-prefixes; the old flag name is preserved "+
+			"for back-compat but will be removed in the next minor release.")
+	if err != nil {
+		log.Printf("warn: mark %q deprecated: %v", blacklistedPfxFlag, err)
+	}
 }
 
 func registerConfigFlag(cobraCmd *cobra.Command, opt pipeline.ConfigurationOption) {
@@ -1645,6 +1831,19 @@ type uastDependent interface {
 	NeedsUAST() bool
 }
 
+// extractTreeDiffPathspec returns the libgit2 pathspec pre-filter produced by
+// TreeDiffAnalyzer.Configure, or nil when no TreeDiffAnalyzer is present or
+// the user did not restrict by language.
+func extractTreeDiffPathspec(core []analyze.HistoryAnalyzer) []string {
+	for _, a := range core {
+		if td, ok := a.(*plumbing.TreeDiffAnalyzer); ok {
+			return td.Pathspec
+		}
+	}
+
+	return nil
+}
+
 func needsUAST(leaves []analyze.HistoryAnalyzer) bool {
 	for _, leaf := range leaves {
 		if ud, ok := leaf.(uastDependent); ok && ud.NeedsUAST() {
@@ -1890,7 +2089,7 @@ func defaultHistoryLeaves() []analyze.HistoryAnalyzer {
 	return result
 }
 
-func defaultStaticAnalyzers() []analyze.StaticAnalyzer {
+func defaultUASTAnalyzers() []analyze.StaticAnalyzer {
 	return []analyze.StaticAnalyzer{
 		clones.NewAnalyzer(),
 		complexity.NewAnalyzer(),
@@ -1901,6 +2100,12 @@ func defaultStaticAnalyzers() []analyze.StaticAnalyzer {
 	}
 }
 
+func defaultRawFileAnalyzers() []analyze.RawFileAnalyzer {
+	return []analyze.RawFileAnalyzer{
+		composition.NewAnalyzer(),
+	}
+}
+
 // validatePlotFlags checks that required flags are present when --format plot is used.
 // rebuildPlotIndex re-scans the output directory and generates a unified index.html
 // that includes pages from all phases (static + history).
diff --git a/cmd/codefang/commands/run_plot_test.go b/cmd/codefang/commands/run_plot_test.go
index a141a6c..e70bebb 100644
--- a/cmd/codefang/commands/run_plot_test.go
+++ b/cmd/codefang/commands/run_plot_test.go
@@ -1,7 +1,5 @@
 package commands
 
-// FRD: specs/frds/FRD-20260228-plot-through-store.md.
-
 import (
 	"context"
 	"io"
@@ -12,6 +10,7 @@ import (
 	"github.com/stretchr/testify/require"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
 )
 
 func TestRunCommand_ForwardsPlotOutputFlag(t *testing.T) {
@@ -20,7 +19,7 @@ func TestRunCommand_ForwardsPlotOutputFlag(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -50,7 +49,7 @@ func TestRunCommand_ForwardsKeepStoreFlag(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -125,13 +124,11 @@ func TestRenderFromStore_CreatesOutputDir(t *testing.T) {
 	require.NoError(t, statErr, "index.html should exist in nested output dir")
 }
 
-// FRD: specs/frds/FRD-20260312-static-plot-multipage.md.
-
 func TestStaticPlot_RequiresOutputFlag(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
diff --git a/cmd/codefang/commands/run_test.go b/cmd/codefang/commands/run_test.go
index f8e3d29..bb61364 100644
--- a/cmd/codefang/commands/run_test.go
+++ b/cmd/codefang/commands/run_test.go
@@ -22,6 +22,7 @@ import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/reportutil"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
 	"github.com/Sumatoshi-tech/codefang/internal/observability"
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 	"github.com/Sumatoshi-tech/codefang/pkg/pipeline"
@@ -114,7 +115,10 @@ func TestRunCommand_DispatchesBothModes(t *testing.T) {
 	)
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, writer io.Writer) error {
+		func(
+			_ string, ids []string, format string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, writer io.Writer,
+		) error {
 			staticCalled = true
 			staticFormat = format
 
@@ -149,7 +153,10 @@ func TestRunCommand_StaticOnly(t *testing.T) {
 	var historyCalled bool
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(
+			_ string, ids []string, _ string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
 			require.Equal(t, []string{"static/complexity"}, ids)
 
 			return nil
@@ -173,7 +180,10 @@ func TestRunCommand_ProgressOutput_DefaultEnabled(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(
+			_ string, ids []string, format string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
 			require.Equal(t, []string{"static/complexity"}, ids)
 			require.Equal(t, analyze.FormatJSON, format)
 
@@ -203,7 +213,7 @@ func TestRunCommand_ProgressOutput_Silent(t *testing.T) {
 	var historySilent bool
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static executor should not be called")
 
 			return nil
@@ -236,7 +246,7 @@ func TestRunCommand_ForwardsHistoryRuntimeOptions(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static executor should not be called")
 
 			return nil
@@ -268,7 +278,7 @@ func TestRunCommand_ForwardsCommitSelectionFlags(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -302,7 +312,7 @@ func TestRunCommand_ForwardsProfilingFlags(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -332,7 +342,7 @@ func TestRunCommand_ForwardsResourceTuningFlags(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -372,7 +382,7 @@ func TestRunCommand_ForwardsCheckpointFlags(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -408,7 +418,7 @@ func TestRunCommand_CheckpointDefaultsPreserved(t *testing.T) {
 	var seenOptions HistoryRunOptions
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
@@ -432,7 +442,10 @@ func TestRunCommand_ProgressOutput_Quiet(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(
+			_ string, ids []string, format string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
 			require.Equal(t, []string{"static/complexity"}, ids)
 			require.Equal(t, analyze.FormatJSON, format)
 
@@ -462,7 +475,9 @@ func TestRunCommand_UnknownAnalyzer(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error { return nil },
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
+			return nil
+		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
 			return nil
 		},
@@ -481,7 +496,10 @@ func TestRunCommand_GlobStaticAnalyzers(t *testing.T) {
 	var historyCalled bool
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(
+			_ string, ids []string, format string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
 			require.Equal(t, []string{"static/complexity"}, ids)
 			require.Equal(t, analyze.FormatJSON, format)
 
@@ -513,7 +531,10 @@ func TestRunCommand_GlobAllAnalyzers(t *testing.T) {
 	)
 
 	command := newRunCommandWithDeps(
-		func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, writer io.Writer) error {
+		func(
+			_ string, ids []string, format string, _ bool, _ bool, _ bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, writer io.Writer,
+		) error {
 			staticCalled = true
 			staticFormat = format
 
@@ -546,7 +567,9 @@ func TestRunCommand_GlobUnknownPattern(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error { return nil },
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
+			return nil
+		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
 			return nil
 		},
@@ -563,7 +586,9 @@ func TestRunCommand_GlobInvalidPattern(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error { return nil },
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
+			return nil
+		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
 			return nil
 		},
@@ -664,7 +689,7 @@ func TestRunCommand_ConvertInput_BinToJSON(t *testing.T) {
 	require.NoError(t, os.WriteFile(inputPath, raw.Bytes(), 0o600))
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static executor should not be called in conversion mode")
 
 			return nil
@@ -718,7 +743,7 @@ func TestRunCommand_ConvertInput_JSONToPlot(t *testing.T) {
 	require.NoError(t, os.WriteFile(inputPath, []byte(input), 0o600))
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static executor should not be called in conversion mode")
 
 			return nil
@@ -766,7 +791,7 @@ func TestRunCommand_ConvertInput_BinToPlot(t *testing.T) {
 	require.NoError(t, os.WriteFile(inputPath, raw.Bytes(), 0o600))
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static executor should not be called in conversion mode")
 
 			return nil
@@ -807,12 +832,12 @@ func TestRunCommand_MixedPlotRunsSeparatePhases(t *testing.T) {
 	outDir := t.TempDir()
 
 	command := newRunCommandWithAllDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			t.Fatal("static text executor should not be called for plot format")
 
 			return nil
 		},
-		func(_ string, ids []string, _ int, _ int64, dir string) error {
+		func(_ string, ids []string, _ int, _ int64, _ []string, _ pathpolicy.Options, dir string) error {
 			staticPlotCalled = true
 
 			require.Equal(t, []string{"static/complexity"}, ids)
@@ -863,7 +888,10 @@ func TestRunCommand_MixedUniversalFormatsRenderUnifiedModel(t *testing.T) {
 			)
 
 			command := newRunCommandWithDeps(
-				func(_ string, ids []string, format string, _ bool, _ bool, _ int, _ int64, writer io.Writer) error {
+				func(
+					_ string, ids []string, format string, _ bool, _ bool, _ bool,
+					_ int, _ int64, _ []string, _ pathpolicy.Options, writer io.Writer,
+				) error {
 					staticFormat = format
 
 					require.Equal(t, []string{"static/complexity"}, ids)
@@ -1084,7 +1112,7 @@ func TestRunCommand_DebugTraceFlag_Accepted(t *testing.T) {
 	t.Parallel()
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
@@ -1109,7 +1137,7 @@ func TestRunCommand_CreatesRootSpan(t *testing.T) {
 	t.Cleanup(func() { require.NoError(t, tp.Shutdown(context.Background())) })
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
@@ -1151,7 +1179,7 @@ func TestRunCommand_ShutdownCalledOnExit(t *testing.T) {
 	var shutdownCalled bool
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
@@ -1194,7 +1222,7 @@ func TestRunCommand_InitializesObservability(t *testing.T) {
 	}
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
@@ -1235,7 +1263,7 @@ func stubRunRegistry() (*analyze.Registry, error) {
 		},
 	}
 
-	return analyze.NewRegistry(staticAnalyzers, historyAnalyzers)
+	return analyze.NewRegistry(staticAnalyzers, nil, historyAnalyzers)
 }
 
 func noopObservabilityInit(_ observability.Config) (observability.Providers, error) {
@@ -1283,7 +1311,7 @@ func TestRunCommand_RootSpanAttributes(t *testing.T) {
 	t.Cleanup(func() { require.NoError(t, tp.Shutdown(context.Background())) })
 
 	command := newRunCommandWithDeps(
-		func(_ string, _ []string, _ string, _ bool, _ bool, _ int, _ int64, _ io.Writer) error {
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
 			return nil
 		},
 		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
@@ -1324,8 +1352,6 @@ func TestRunCommand_RootSpanAttributes(t *testing.T) {
 	require.Contains(t, rootAttrs, "codefang.duration_class", "root span should have duration_class")
 }
 
-// FRD: specs/frds/FRD-20260311-static-memory-limit.md.
-
 func TestParseMemoryBudgetBytes_Valid(t *testing.T) {
 	t.Parallel()
 
@@ -1364,12 +1390,10 @@ func TestApplyStaticMemoryLimit_SetsAndRestores(t *testing.T) {
 	restore()
 }
 
-// FRD: specs/frds/FRD-20260312-static-budget-tuning.md.
-
 func TestApplyStaticBudgetConfig_ZeroBudget(t *testing.T) {
 	t.Parallel()
 
-	service := analyze.NewStaticService(nil)
+	service := analyze.NewStaticService(nil, nil)
 	applyStaticBudgetConfig(service, 0, 0)
 
 	assert.Zero(t, service.MaxWorkers)
@@ -1381,7 +1405,7 @@ func TestApplyStaticBudgetConfig_WithBudget(t *testing.T) {
 
 	const budgetOneGiB int64 = 1024 * 1024 * 1024
 
-	service := analyze.NewStaticService(nil)
+	service := analyze.NewStaticService(nil, nil)
 	applyStaticBudgetConfig(service, 0, budgetOneGiB)
 
 	assert.Positive(t, service.MaxWorkers)
@@ -1395,7 +1419,7 @@ func TestApplyStaticBudgetConfig_ExplicitWorkersOverride(t *testing.T) {
 
 	const explicitWorkers = 2
 
-	service := analyze.NewStaticService(nil)
+	service := analyze.NewStaticService(nil, nil)
 	service.MaxWorkers = explicitWorkers
 	applyStaticBudgetConfig(service, explicitWorkers, budgetOneGiB)
 
@@ -1405,3 +1429,176 @@ func TestApplyStaticBudgetConfig_ExplicitWorkersOverride(t *testing.T) {
 	// Spill threshold should still be derived from budget.
 	assert.Positive(t, service.SpillThreshold)
 }
+
+func TestApplyStaticLanguageFilter_EmptyInput_DisablesFilter(t *testing.T) {
+	t.Parallel()
+
+	service := analyze.NewStaticService(nil, nil)
+
+	err := applyStaticLanguageFilter(service, nil)
+	require.NoError(t, err)
+	assert.Nil(t, service.LanguageGlobs,
+		"empty input must disable the filter (nil LanguageGlobs)")
+}
+
+func TestApplyStaticLanguageFilter_AllKeyword_DisablesFilter(t *testing.T) {
+	t.Parallel()
+
+	service := analyze.NewStaticService(nil, nil)
+
+	err := applyStaticLanguageFilter(service, []string{"all"})
+	require.NoError(t, err)
+	assert.Nil(t, service.LanguageGlobs,
+		"'all' sentinel must disable the filter")
+}
+
+func TestApplyStaticLanguageFilter_KnownLanguage_PopulatesGlobs(t *testing.T) {
+	t.Parallel()
+
+	service := analyze.NewStaticService(nil, nil)
+
+	err := applyStaticLanguageFilter(service, []string{"go"})
+	require.NoError(t, err)
+	assert.Contains(t, service.LanguageGlobs, "*.go")
+}
+
+func TestApplyStaticLanguageFilter_UnknownLanguage_FailsFast(t *testing.T) {
+	t.Parallel()
+
+	service := analyze.NewStaticService(nil, nil)
+
+	err := applyStaticLanguageFilter(service, []string{"notalang"})
+	require.Error(t, err)
+	assert.Contains(t, err.Error(), "notalang",
+		"unknown language must surface at configure time for static-only runs")
+}
+
+func TestRunCommand_PerFileFlag_Propagated(t *testing.T) {
+	t.Parallel()
+
+	var seenPerFile bool
+
+	command := newRunCommandWithDeps(
+		func(
+			_ string, _ []string, _ string, _ bool, _ bool, perFile bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
+			seenPerFile = perFile
+
+			return nil
+		},
+		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
+			return nil
+		},
+		stubRunRegistry,
+		noopObservabilityInit,
+	)
+
+	command.SetArgs([]string{"-a", "static/complexity", "--per-file"})
+	err := command.Execute()
+	require.NoError(t, err)
+	require.True(t, seenPerFile, "--per-file flag must be propagated to staticExecutor")
+}
+
+func TestRunCommand_PerFileFlag_ShortAlias(t *testing.T) {
+	t.Parallel()
+
+	var seenPerFile bool
+
+	command := newRunCommandWithDeps(
+		func(
+			_ string, _ []string, _ string, _ bool, _ bool, perFile bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
+			seenPerFile = perFile
+
+			return nil
+		},
+		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
+			return nil
+		},
+		stubRunRegistry,
+		noopObservabilityInit,
+	)
+
+	command.SetArgs([]string{"-a", "static/complexity", "-F"})
+	err := command.Execute()
+	require.NoError(t, err)
+	require.True(t, seenPerFile, "-F short alias must be propagated to staticExecutor")
+}
+
+func TestRunCommand_PerFileFlag_DefaultFalse(t *testing.T) {
+	t.Parallel()
+
+	var seenPerFile bool
+
+	command := newRunCommandWithDeps(
+		func(
+			_ string, _ []string, _ string, _ bool, _ bool, perFile bool,
+			_ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer,
+		) error {
+			seenPerFile = perFile
+
+			return nil
+		},
+		func(_ context.Context, _ string, _ []string, _ string, _ bool, _ HistoryRunOptions, _ io.Writer) error {
+			return nil
+		},
+		stubRunRegistry,
+		noopObservabilityInit,
+	)
+
+	command.SetArgs([]string{"-a", "static/complexity"})
+	err := command.Execute()
+	require.NoError(t, err)
+	require.False(t, seenPerFile, "per-file must be false by default")
+}
+
+func TestRunCommand_CacheDirFlag_Propagated(t *testing.T) {
+	t.Parallel()
+
+	var seenOpts HistoryRunOptions
+
+	command := newRunCommandWithDeps(
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
+			return nil
+		},
+		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
+			seenOpts = opts
+
+			return nil
+		},
+		stubRunRegistry,
+		noopObservabilityInit,
+	)
+
+	command.SetArgs([]string{"-a", "history/devs", "--cache-dir", "/tmp/test-cache"})
+	err := command.Execute()
+	require.NoError(t, err)
+	assert.Equal(t, "/tmp/test-cache", seenOpts.CacheDir)
+	assert.False(t, seenOpts.NoCache)
+}
+
+func TestRunCommand_NoCacheFlag_Propagated(t *testing.T) {
+	t.Parallel()
+
+	var seenOpts HistoryRunOptions
+
+	command := newRunCommandWithDeps(
+		func(_ string, _ []string, _ string, _ bool, _ bool, _ bool, _ int, _ int64, _ []string, _ pathpolicy.Options, _ io.Writer) error {
+			return nil
+		},
+		func(_ context.Context, _ string, _ []string, _ string, _ bool, opts HistoryRunOptions, _ io.Writer) error {
+			seenOpts = opts
+
+			return nil
+		},
+		stubRunRegistry,
+		noopObservabilityInit,
+	)
+
+	command.SetArgs([]string{"-a", "history/devs", "--cache-dir", "/tmp/cache", "--no-cache"})
+	err := command.Execute()
+	require.NoError(t, err)
+	assert.True(t, seenOpts.NoCache)
+}
diff --git a/cmd/uast/server_test.go b/cmd/uast/server_test.go
index e26cdd5..222aa8a 100644
--- a/cmd/uast/server_test.go
+++ b/cmd/uast/server_test.go
@@ -2,6 +2,7 @@ package main
 
 import (
 	"bytes"
+	"context"
 	"encoding/json"
 	"net/http"
 	"net/http/httptest"
@@ -66,7 +67,7 @@ string <- (string) => uast(
 	}
 
 	// Create test request.
-	req := httptest.NewRequest(http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
 	req.Header.Set("Content-Type", "application/json")
 
 	// Create response recorder.
@@ -130,7 +131,7 @@ func TestHandleParseWithoutCustomUASTMaps(t *testing.T) {
 	}
 
 	// Create test request.
-	req := httptest.NewRequest(http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
 	req.Header.Set("Content-Type", "application/json")
 
 	// Create response recorder.
@@ -193,7 +194,7 @@ func TestHandleParseWithInvalidUASTMaps(t *testing.T) {
 	}
 
 	// Create test request.
-	req := httptest.NewRequest(http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodPost, "/api/parse", bytes.NewBuffer(jsonData))
 	req.Header.Set("Content-Type", "application/json")
 
 	// Create response recorder.
@@ -230,7 +231,7 @@ func TestUASTServer_MiddlewareWrapsRoutes(t *testing.T) {
 	tracer := noop.NewTracerProvider().Tracer("test")
 	handler := newServerMux(tracer)
 
-	req := httptest.NewRequest(http.MethodGet, "/api/mappings", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/api/mappings", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	require.NotPanics(t, func() {
diff --git a/internal/analyzers/analyze/analyzer.go b/internal/analyzers/analyze/analyzer.go
index 0c4e578..7b7f605 100644
--- a/internal/analyzers/analyze/analyzer.go
+++ b/internal/analyzers/analyze/analyzer.go
@@ -85,11 +85,12 @@ type Analyzer interface {
 	Configure(facts map[string]any) error
 }
 
-// StaticAnalyzer interface defines the contract for UAST-based static analysis.
-type StaticAnalyzer interface {
+// FormattableAnalyzer is the shared contract for analyzers that produce
+// reportable output with thresholds, aggregation, and format methods.
+// Both StaticAnalyzer and RawFileAnalyzer satisfy this interface.
+type FormattableAnalyzer interface {
 	Analyzer
 
-	Analyze(root *node.Node) (Report, error)
 	Thresholds() Thresholds
 
 	// Aggregation methods.
@@ -103,6 +104,23 @@ type StaticAnalyzer interface {
 	FormatReportBinary(report Report, writer io.Writer) error
 }
 
+// StaticAnalyzer defines the contract for UAST-based static analysis.
+// Runs during the UAST phase on parsed AST nodes.
+type StaticAnalyzer interface {
+	FormattableAnalyzer
+
+	Analyze(root *node.Node) (Report, error)
+}
+
+// RawFileAnalyzer defines the contract for analyzers that operate on raw file
+// content (path + bytes) without UAST parsing. Runs during the raw-file phase,
+// which walks ALL files in the directory tree (not just UAST-supported ones).
+type RawFileAnalyzer interface {
+	FormattableAnalyzer
+
+	AnalyzeFileContent(path string, content []byte) (Report, error)
+}
+
 // VisitorProvider enables single-pass traversal optimization.
 type VisitorProvider interface {
 	CreateVisitor() AnalysisVisitor
diff --git a/internal/analyzers/analyze/analyzer_test.go b/internal/analyzers/analyze/analyzer_test.go
index cda0e0a..42ec944 100644
--- a/internal/analyzers/analyze/analyzer_test.go
+++ b/internal/analyzers/analyze/analyzer_test.go
@@ -374,8 +374,6 @@ func TestRunAnalyzers_Parallel(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260303-data-extraction-guard.md.
-
 func TestReportFunctionListWithFallback_PrimaryKeyFound(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/analyze/budget_static_test.go b/internal/analyzers/analyze/budget_static_test.go
index 52a6273..3872d3a 100644
--- a/internal/analyzers/analyze/budget_static_test.go
+++ b/internal/analyzers/analyze/budget_static_test.go
@@ -2,8 +2,6 @@
 
 package analyze_test
 
-// FRD: specs/frds/FRD-20260312-static-budget-integration-test.md.
-
 import (
 	"context"
 	"runtime/debug"
@@ -46,7 +44,7 @@ func TestStaticAnalyzers_MemoryBudget(t *testing.T) {
 
 	dir := setupHeavyBenchDir(t, budgetTestFileCount, budgetTestFunctionsPerFile)
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.NativeMemoryReleaseFn = func() {} // Skip real malloc_trim in test.
 
 	// Apply budget-derived parameters.
diff --git a/internal/analyzers/analyze/commits_by_tick_test.go b/internal/analyzers/analyze/commits_by_tick_test.go
index db8e207..9bbd9fe 100644
--- a/internal/analyzers/analyze/commits_by_tick_test.go
+++ b/internal/analyzers/analyze/commits_by_tick_test.go
@@ -9,8 +9,6 @@ import (
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 )
 
-// FRD: specs/frds/FRD-20260302-build-commits-by-tick.md.
-
 // testTickData is a minimal tick data type for testing BuildCommitsByTick.
 type testTickData struct {
 	Commits map[string]int
diff --git a/internal/analyzers/analyze/conversion.go b/internal/analyzers/analyze/conversion.go
index ec54e11..860f951 100644
--- a/internal/analyzers/analyze/conversion.go
+++ b/internal/analyzers/analyze/conversion.go
@@ -24,15 +24,17 @@ var ErrInvalidUnifiedModel = errors.New("invalid unified model")
 
 // AnalyzerResult represents one analyzer report in canonical converted output.
 type AnalyzerResult struct {
-	ID     string       `json:"id"     yaml:"id"`
-	Mode   AnalyzerMode `json:"mode"   yaml:"mode"`
-	Report Report       `json:"report" yaml:"report"`
+	ID     string         `json:"id"               yaml:"id"`
+	Mode   AnalyzerMode   `json:"mode"             yaml:"mode"`
+	Schema AnalyzerSchema `json:"schema,omitempty" yaml:"schema,omitempty"`
+	Report Report         `json:"report"           yaml:"report"`
 }
 
 // UnifiedModel is the canonical intermediate model for run output conversion.
 type UnifiedModel struct {
-	Version   string           `json:"version"   yaml:"version"`
-	Analyzers []AnalyzerResult `json:"analyzers" yaml:"analyzers"`
+	Version   string            `json:"version"            yaml:"version"`
+	Metadata  *AnalysisMetadata `json:"metadata,omitempty" yaml:"metadata,omitempty"`
+	Analyzers []AnalyzerResult  `json:"analyzers"          yaml:"analyzers"`
 }
 
 // Validate ensures canonical model constraints are satisfied.
@@ -219,6 +221,7 @@ func DecodeCombinedBinaryReports(input []byte, ids []string, modes []AnalyzerMod
 		results[i] = AnalyzerResult{
 			ID:     ids[i],
 			Mode:   modes[i],
+			Schema: SchemaForAnalyzer(ids[i]),
 			Report: report,
 		}
 	}
@@ -321,6 +324,8 @@ func WriteConvertedOutput(model UnifiedModel, outputFormat string, writer io.Wri
 		return writeConvertedTimeSeries(model, FormatTimeSeries, writer)
 	case FormatTimeSeriesNDJSON:
 		return writeConvertedTimeSeries(model, FormatTimeSeriesNDJSON, writer)
+	case FormatNDJSON:
+		return writeConvertedNDJSON(model, writer)
 	case FormatPlot:
 		if plotRendererFn == nil {
 			return fmt.Errorf("%w: plot renderer not registered", ErrUnsupportedFormat)
@@ -332,6 +337,33 @@ func WriteConvertedOutput(model UnifiedModel, outputFormat string, writer io.Wri
 	}
 }
 
+// writeConvertedNDJSON writes one compact JSON line per analyzer result.
+// If metadata is present, a metadata line is written first.
+func writeConvertedNDJSON(model UnifiedModel, writer io.Writer) error {
+	encoder := json.NewEncoder(writer)
+
+	if model.Metadata != nil {
+		metaLine := map[string]any{
+			"version":  model.Version,
+			"metadata": model.Metadata,
+		}
+
+		err := encoder.Encode(metaLine)
+		if err != nil {
+			return fmt.Errorf("encode ndjson metadata: %w", err)
+		}
+	}
+
+	for _, result := range model.Analyzers {
+		err := encoder.Encode(result)
+		if err != nil {
+			return fmt.Errorf("encode ndjson analyzer %s: %w", result.ID, err)
+		}
+	}
+
+	return nil
+}
+
 // writeConvertedTimeSeries builds merged timeseries from a unified model's
 // history reports and writes the result to the writer.
 func writeConvertedTimeSeries(model UnifiedModel, format string, writer io.Writer) error {
diff --git a/internal/analyzers/analyze/conversion_ndjson_test.go b/internal/analyzers/analyze/conversion_ndjson_test.go
new file mode 100644
index 0000000..4f237ee
--- /dev/null
+++ b/internal/analyzers/analyze/conversion_ndjson_test.go
@@ -0,0 +1,83 @@
+package analyze_test
+
+import (
+	"bytes"
+	"encoding/json"
+	"strings"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+func TestWriteConvertedOutput_NDJSON_OneLinePerAnalyzer(t *testing.T) {
+	t.Parallel()
+
+	model := analyze.UnifiedModel{
+		Version: analyze.UnifiedModelVersion,
+		Analyzers: []analyze.AnalyzerResult{
+			{ID: "static/complexity", Mode: analyze.ModeStatic, Report: analyze.Report{"total": 10}},
+			{ID: "history/sentiment", Mode: analyze.ModeHistory, Report: analyze.Report{"score": 0.8}},
+		},
+	}
+
+	var buf bytes.Buffer
+
+	err := analyze.WriteConvertedOutput(model, analyze.FormatNDJSON, &buf)
+	require.NoError(t, err)
+
+	lines := strings.Split(strings.TrimSpace(buf.String()), "\n")
+	require.Len(t, lines, 2)
+
+	var line1 map[string]any
+	require.NoError(t, json.Unmarshal([]byte(lines[0]), &line1))
+	assert.Equal(t, "static/complexity", line1["id"])
+	assert.Equal(t, "static", line1["mode"])
+
+	var line2 map[string]any
+	require.NoError(t, json.Unmarshal([]byte(lines[1]), &line2))
+	assert.Equal(t, "history/sentiment", line2["id"])
+}
+
+func TestWriteConvertedOutput_NDJSON_EmptyAnalyzers(t *testing.T) {
+	t.Parallel()
+
+	model := analyze.UnifiedModel{
+		Version:   analyze.UnifiedModelVersion,
+		Analyzers: nil,
+	}
+
+	var buf bytes.Buffer
+
+	err := analyze.WriteConvertedOutput(model, analyze.FormatNDJSON, &buf)
+	require.NoError(t, err)
+
+	assert.Empty(t, strings.TrimSpace(buf.String()))
+}
+
+func TestWriteConvertedOutput_NDJSON_WithMetadata(t *testing.T) {
+	t.Parallel()
+
+	model := analyze.UnifiedModel{
+		Version:  analyze.UnifiedModelVersion,
+		Metadata: analyze.NewAnalysisMetadata("/repo/test"),
+		Analyzers: []analyze.AnalyzerResult{
+			{ID: "static/test", Mode: analyze.ModeStatic, Report: analyze.Report{}},
+		},
+	}
+
+	var buf bytes.Buffer
+
+	err := analyze.WriteConvertedOutput(model, analyze.FormatNDJSON, &buf)
+	require.NoError(t, err)
+
+	lines := strings.Split(strings.TrimSpace(buf.String()), "\n")
+	require.Len(t, lines, 2) // Metadata line + 1 analyzer line.
+
+	var metaLine map[string]any
+	require.NoError(t, json.Unmarshal([]byte(lines[0]), &metaLine))
+	assert.Equal(t, analyze.UnifiedModelVersion, metaLine["version"])
+	assert.NotNil(t, metaLine["metadata"])
+}
diff --git a/internal/analyzers/analyze/export_test.go b/internal/analyzers/analyze/export_test.go
new file mode 100644
index 0000000..6bf93b0
--- /dev/null
+++ b/internal/analyzers/analyze/export_test.go
@@ -0,0 +1,7 @@
+package analyze
+
+// LanguageGlobMatcher exposes matchesLanguageGlobs for black-box tests
+// in the analyze_test package.
+func LanguageGlobMatcher(name string, globs []string) bool {
+	return matchesLanguageGlobs(name, globs)
+}
diff --git a/internal/analyzers/analyze/metadata.go b/internal/analyzers/analyze/metadata.go
new file mode 100644
index 0000000..25840da
--- /dev/null
+++ b/internal/analyzers/analyze/metadata.go
@@ -0,0 +1,26 @@
+package analyze
+
+import (
+	"path/filepath"
+	"time"
+
+	"github.com/Sumatoshi-tech/codefang/pkg/version"
+)
+
+// AnalysisMetadata holds provenance information for a codefang run.
+type AnalysisMetadata struct {
+	RepoPath        string `json:"repo_path"        yaml:"repo_path"`
+	RepoName        string `json:"repo_name"        yaml:"repo_name"`
+	AnalyzedAt      string `json:"analyzed_at"      yaml:"analyzed_at"`
+	CodefangVersion string `json:"codefang_version" yaml:"codefang_version"`
+}
+
+// NewAnalysisMetadata creates metadata for the given repository path.
+func NewAnalysisMetadata(repoPath string) *AnalysisMetadata {
+	return &AnalysisMetadata{
+		RepoPath:        repoPath,
+		RepoName:        filepath.Base(repoPath),
+		AnalyzedAt:      time.Now().UTC().Format(time.RFC3339),
+		CodefangVersion: version.Version,
+	}
+}
diff --git a/internal/analyzers/analyze/metadata_test.go b/internal/analyzers/analyze/metadata_test.go
new file mode 100644
index 0000000..a6638d0
--- /dev/null
+++ b/internal/analyzers/analyze/metadata_test.go
@@ -0,0 +1,75 @@
+package analyze_test
+
+import (
+	"encoding/json"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+const testRepoPath = "/home/user/sources/kubernetes"
+
+func TestNewAnalysisMetadata_RepoName(t *testing.T) {
+	t.Parallel()
+
+	meta := analyze.NewAnalysisMetadata(testRepoPath)
+
+	assert.Equal(t, "kubernetes", meta.RepoName)
+}
+
+func TestNewAnalysisMetadata_RepoPath(t *testing.T) {
+	t.Parallel()
+
+	meta := analyze.NewAnalysisMetadata(testRepoPath)
+
+	assert.Equal(t, testRepoPath, meta.RepoPath)
+}
+
+func TestNewAnalysisMetadata_AnalyzedAt(t *testing.T) {
+	t.Parallel()
+
+	before := time.Now()
+	meta := analyze.NewAnalysisMetadata(testRepoPath)
+	after := time.Now()
+
+	parsed, err := time.Parse(time.RFC3339, meta.AnalyzedAt)
+	require.NoError(t, err)
+	assert.False(t, parsed.Before(before.Truncate(time.Second)))
+	assert.False(t, parsed.After(after.Add(time.Second)))
+}
+
+func TestNewAnalysisMetadata_Version(t *testing.T) {
+	t.Parallel()
+
+	meta := analyze.NewAnalysisMetadata(testRepoPath)
+
+	assert.NotEmpty(t, meta.CodefangVersion)
+}
+
+func TestUnifiedModel_MetadataInJSON(t *testing.T) {
+	t.Parallel()
+
+	model := analyze.UnifiedModel{
+		Version:  analyze.UnifiedModelVersion,
+		Metadata: analyze.NewAnalysisMetadata(testRepoPath),
+		Analyzers: []analyze.AnalyzerResult{
+			{ID: "static/test", Mode: analyze.ModeStatic, Report: analyze.Report{}},
+		},
+	}
+
+	data, err := json.Marshal(model)
+	require.NoError(t, err)
+
+	var parsed map[string]any
+	require.NoError(t, json.Unmarshal(data, &parsed))
+
+	meta, ok := parsed["metadata"].(map[string]any)
+	require.True(t, ok, "metadata section must exist in JSON")
+	assert.Equal(t, "kubernetes", meta["repo_name"])
+	assert.NotEmpty(t, meta["analyzed_at"])
+	assert.NotEmpty(t, meta["codefang_version"])
+}
diff --git a/internal/analyzers/analyze/metrics_safe_test.go b/internal/analyzers/analyze/metrics_safe_test.go
index 62a3e3e..60b758b 100644
--- a/internal/analyzers/analyze/metrics_safe_test.go
+++ b/internal/analyzers/analyze/metrics_safe_test.go
@@ -8,8 +8,6 @@ import (
 	"github.com/stretchr/testify/require"
 )
 
-// FRD: specs/frds/FRD-20260302-compute-metrics-safe.md.
-
 // testMetrics is a minimal metrics type for testing SafeMetricComputer.
 type testMetrics struct {
 	Value int
diff --git a/internal/analyzers/analyze/perfile.go b/internal/analyzers/analyze/perfile.go
new file mode 100644
index 0000000..35e4ef6
--- /dev/null
+++ b/internal/analyzers/analyze/perfile.go
@@ -0,0 +1,79 @@
+package analyze
+
+import "path/filepath"
+
+// PerFileModeEnabled is implemented by aggregators that support per-file report retention.
+// StaticService uses this to enable per-file mode and extract results after analysis.
+type PerFileModeEnabled interface {
+	SetPerFileMode(enabled bool)
+	PerFileResults() map[string]Report
+}
+
+// PerFileResults returns per-file reports collected during the last AnalyzeFolder call.
+// Returns nil when PerFile is false or no files were analyzed.
+// Keyed by analyzer name → file path → per-file report.
+func (svc *StaticService) PerFileResults() map[string]map[string]Report {
+	return svc.perFileResults
+}
+
+// extractPerFileResults collects per-file reports from all aggregators that support it.
+func extractPerFileResults(aggregators map[string]ResultAggregator) map[string]map[string]Report {
+	result := make(map[string]map[string]Report, len(aggregators))
+
+	for name, agg := range aggregators {
+		pfm, ok := agg.(PerFileModeEnabled)
+		if !ok {
+			continue
+		}
+
+		fileReports := pfm.PerFileResults()
+		if len(fileReports) > 0 {
+			result[name] = fileReports
+		}
+	}
+
+	if len(result) == 0 {
+		return nil
+	}
+
+	return result
+}
+
+// enrichWithPerFileData injects per-file data into each section of the base JSON report.
+// It uses the PerFileEnricher interface to avoid import cycles with the renderer package.
+// Enrichment happens in place; a report that is not a PerFileEnricher is returned unchanged.
+func (svc *StaticService) enrichWithPerFileData(report any, _ []ReportSection) any {
+	enricher, ok := report.(PerFileEnricher)
+	if !ok {
+		return report
+	}
+
+	enricher.EnrichWithPerFileData(svc.PerFileResults(), svc.analysisRootPath, svc.allFormattable())
+
+	return report
+}
+
+// PerFileEnricher is implemented by JSON report types that support per-file data injection.
+// It is declared here, and implemented by renderer.JSONReport, to avoid import cycles.
+type PerFileEnricher interface {
+	EnrichWithPerFileData(
+		perFileResults map[string]map[string]Report,
+		rootPath string,
+		analyzers []FormattableAnalyzer,
+	)
+}
+
+// MakeRelativePath converts an absolute file path to be relative to rootPath.
+// Returns the original path if it cannot be made relative.
+func MakeRelativePath(filePath, rootPath string) string {
+	if rootPath == "" {
+		return filePath
+	}
+
+	rel, err := filepath.Rel(rootPath, filePath)
+	if err != nil {
+		return filePath
+	}
+
+	return rel
+}
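Review note on `MakeRelativePath`: the fallback to the original path only triggers when `filepath.Rel` returns an error. A file outside `rootPath` does not error, it yields a `..`-prefixed path. The sketch below copies the function body into a standalone program to show the four cases. The lowercase name `makeRelativePath` and the sample paths are illustrative, and the expected outputs assume Unix-style separators.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// makeRelativePath mirrors MakeRelativePath from perfile.go so the
// example compiles on its own.
func makeRelativePath(filePath, rootPath string) string {
	if rootPath == "" {
		return filePath
	}

	rel, err := filepath.Rel(rootPath, filePath)
	if err != nil {
		return filePath
	}

	return rel
}

func main() {
	// File under the root: path becomes root-relative.
	fmt.Println(makeRelativePath("/repo/pkg/a.go", "/repo")) // pkg/a.go

	// Empty root: path is returned as-is.
	fmt.Println(makeRelativePath("/repo/pkg/a.go", "")) // /repo/pkg/a.go

	// File outside the root: Rel succeeds and produces a "../" path.
	fmt.Println(makeRelativePath("/other/a.go", "/repo")) // ../other/a.go

	// Relative file against an absolute root: Rel fails, original is kept.
	fmt.Println(makeRelativePath("pkg/a.go", "/repo")) // pkg/a.go
}
```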
diff --git a/internal/analyzers/analyze/record_reader_test.go b/internal/analyzers/analyze/record_reader_test.go
index 390983c..fb88c8e 100644
--- a/internal/analyzers/analyze/record_reader_test.go
+++ b/internal/analyzers/analyze/record_reader_test.go
@@ -1,7 +1,5 @@
 package analyze
 
-// FRD: specs/frds/FRD-20260302-record-reader.md.
-
 import (
 	"encoding/gob"
 	"testing"
diff --git a/internal/analyzers/analyze/record_writer_test.go b/internal/analyzers/analyze/record_writer_test.go
index 49a2d69..f960ad5 100644
--- a/internal/analyzers/analyze/record_writer_test.go
+++ b/internal/analyzers/analyze/record_writer_test.go
@@ -1,7 +1,5 @@
 package analyze
 
-// FRD: specs/frds/FRD-20260303-write-slice-kind.md.
-
 import (
 	"encoding/gob"
 	"errors"
diff --git a/internal/analyzers/analyze/registry.go b/internal/analyzers/analyze/registry.go
index b3f4440..192adbc 100644
--- a/internal/analyzers/analyze/registry.go
+++ b/internal/analyzers/analyze/registry.go
@@ -44,15 +44,21 @@ var ErrInvalidAnalyzerMode = errors.New("invalid analyzer mode")
 var ErrInvalidAnalyzerGlob = errors.New("invalid analyzer glob")
 
 // NewRegistry creates a registry from analyzer descriptors.
-func NewRegistry(static []StaticAnalyzer, history []HistoryAnalyzer) (*Registry, error) {
-	ordered := make([]Descriptor, 0, len(static)+len(history))
-	index := make(map[string]Descriptor, len(static)+len(history))
+func NewRegistry(static []StaticAnalyzer, raw []RawFileAnalyzer, history []HistoryAnalyzer) (*Registry, error) {
+	totalCap := len(static) + len(raw) + len(history)
+	ordered := make([]Descriptor, 0, totalCap)
+	index := make(map[string]Descriptor, totalCap)
 
 	err := appendDescriptors(ModeStatic, static, index, &ordered)
 	if err != nil {
 		return nil, err
 	}
 
+	err = appendDescriptors(ModeStatic, raw, index, &ordered)
+	if err != nil {
+		return nil, err
+	}
+
 	err = appendDescriptors(ModeHistory, history, index, &ordered)
 	if err != nil {
 		return nil, err
diff --git a/internal/analyzers/analyze/registry_test.go b/internal/analyzers/analyze/registry_test.go
index 1753ac2..1ad88c0 100644
--- a/internal/analyzers/analyze/registry_test.go
+++ b/internal/analyzers/analyze/registry_test.go
@@ -88,7 +88,7 @@ func (s *stubHistoryAnalyzer) ReportFromTICKs(_ context.Context, _ []analyze.TIC
 func TestRegistry_AllStableOrder(t *testing.T) {
 	t.Parallel()
 
-	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), defaultHistoryForRegistryTest())
+	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), nil, defaultHistoryForRegistryTest())
 	if err != nil {
 		t.Fatalf("unexpected registry creation error: %v", err)
 	}
@@ -110,7 +110,7 @@ func TestRegistry_AllStableOrder(t *testing.T) {
 func TestRegistry_IDsByMode(t *testing.T) {
 	t.Parallel()
 
-	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), defaultHistoryForRegistryTest())
+	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), nil, defaultHistoryForRegistryTest())
 	if err != nil {
 		t.Fatalf("unexpected registry creation error: %v", err)
 	}
@@ -130,7 +130,7 @@ func TestRegistry_IDsByMode(t *testing.T) {
 func TestRegistry_Split(t *testing.T) {
 	t.Parallel()
 
-	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), defaultHistoryForRegistryTest())
+	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), nil, defaultHistoryForRegistryTest())
 	if err != nil {
 		t.Fatalf("unexpected registry creation error: %v", err)
 	}
@@ -152,7 +152,7 @@ func TestRegistry_Split(t *testing.T) {
 func TestRegistry_SplitUnknown(t *testing.T) {
 	t.Parallel()
 
-	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), defaultHistoryForRegistryTest())
+	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), nil, defaultHistoryForRegistryTest())
 	if err != nil {
 		t.Fatalf("unexpected registry creation error: %v", err)
 	}
@@ -164,7 +164,7 @@ func TestRegistry_SplitUnknown(t *testing.T) {
 }
 
 // complexityID is a stable fixture for the first registered static analyzer.
-// Used by ExpandPatterns tests — FRD: specs/frds/FRD-20260306-append-unique-ids-removal.md.
+// Used by ExpandPatterns tests.
 const complexityID = "static/complexity"
 
 func TestRegistry_ExpandPatterns_ExactMatch(t *testing.T) {
@@ -303,7 +303,7 @@ func TestRegistry_SelectedIDs_WithPatterns(t *testing.T) {
 func newTestRegistry(t *testing.T) *analyze.Registry {
 	t.Helper()
 
-	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), defaultHistoryForRegistryTest())
+	registry, err := analyze.NewRegistry(defaultStaticForRegistryTest(), nil, defaultHistoryForRegistryTest())
 	if err != nil {
 		t.Fatalf("failed to create registry: %v", err)
 	}
diff --git a/internal/analyzers/analyze/schema_registry.go b/internal/analyzers/analyze/schema_registry.go
new file mode 100644
index 0000000..ac97789
--- /dev/null
+++ b/internal/analyzers/analyze/schema_registry.go
@@ -0,0 +1,126 @@
+package analyze
+
+// FieldMeta describes a single field in an analyzer's output schema.
+type FieldMeta struct {
+	Type        string `json:"type"                  yaml:"type"`
+	Grain       string `json:"grain,omitempty"       yaml:"grain,omitempty"`
+	Description string `json:"description,omitempty" yaml:"description,omitempty"`
+}
+
+// AnalyzerSchema maps output field names to their metadata.
+type AnalyzerSchema map[string]FieldMeta
+
+// SchemaForAnalyzer returns the output schema for the given analyzer ID,
+// or nil if the analyzer is not registered.
+func SchemaForAnalyzer(analyzerID string) AnalyzerSchema {
+	schema, ok := analyzerSchemas[analyzerID]
+	if !ok {
+		return nil
+	}
+
+	return schema
+}
+
+// analyzerSchemas is the static registry of output schemas for all analyzers.
+var analyzerSchemas = map[string]AnalyzerSchema{
+	"static/complexity": {
+		"function_complexity": {Type: "list", Grain: "function", Description: "Per-function cyclomatic and cognitive complexity"},
+		"distribution":        {Type: "aggregate", Description: "Complexity distribution (simple/moderate/complex)"},
+		"high_risk_functions": {Type: "risk", Grain: "function", Description: "Functions exceeding complexity thresholds"},
+		"aggregate":           {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"static/halstead": {
+		"function_halstead":     {Type: "list", Grain: "function", Description: "Per-function Halstead volume, effort, and bugs"},
+		"distribution":          {Type: "aggregate", Description: "Effort distribution (low/medium/high/very_high)"},
+		"high_effort_functions": {Type: "risk", Grain: "function", Description: "Functions with high Halstead effort"},
+		"aggregate":             {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"static/cohesion": {
+		"function_cohesion":      {Type: "list", Grain: "function", Description: "Per-function LCOM cohesion score"},
+		"distribution":           {Type: "aggregate", Description: "Cohesion distribution"},
+		"low_cohesion_functions": {Type: "risk", Grain: "function", Description: "Functions with poor cohesion"},
+		"aggregate":              {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"static/comments": {
+		"comment_quality":        {Type: "list", Grain: "comment", Description: "Per-comment quality assessment"},
+		"function_documentation": {Type: "list", Grain: "function", Description: "Per-function documentation status"},
+		"undocumented_functions": {Type: "risk", Grain: "function", Description: "Functions lacking documentation"},
+		"aggregate":              {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"static/clones": {
+		"clone_pairs":             {Type: "list", Grain: "pair", Description: "Detected clone pairs with similarity"},
+		"clone_type_distribution": {Type: "aggregate", Description: "Clone type breakdown (Type-1/2/3)"},
+		"total_functions":         {Type: "scalar", Description: "Total functions analyzed"},
+		"total_clone_pairs":       {Type: "scalar", Description: "Total clone pairs (uncapped)"},
+		"clone_ratio":             {Type: "scalar", Description: "Fraction of functions involved in duplication"},
+	},
+	"static/imports": {
+		"import_list":  {Type: "list", Grain: "import", Description: "All import statements"},
+		"dependencies": {Type: "list", Grain: "dependency", Description: "External dependencies with risk"},
+		"categories":   {Type: "aggregate", Description: "Import category breakdown"},
+		"aggregate":    {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"static/composition": {
+		"breakdown":   {Type: "aggregate", Description: "File count per category"},
+		"percentages": {Type: "aggregate", Description: "Percentage per category"},
+		"total_files": {Type: "scalar", Description: "Total files analyzed"},
+	},
+	"history/sentiment": {
+		"time_series":           {Type: "time_series", Grain: "tick", Description: "Per-tick sentiment scores"},
+		"trend":                 {Type: "aggregate", Description: "Sentiment trend direction"},
+		"low_sentiment_periods": {Type: "risk", Grain: "tick", Description: "Ticks with negative sentiment"},
+		"aggregate":             {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/anomaly": {
+		"anomalies":   {Type: "list", Grain: "tick", Description: "Detected anomalous ticks"},
+		"time_series": {Type: "time_series", Grain: "tick", Description: "Per-tick anomaly metrics and z-scores"},
+		"aggregate":   {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/devs": {
+		"developers": {Type: "list", Grain: "developer", Description: "Per-developer contribution statistics"},
+		"languages":  {Type: "list", Grain: "language", Description: "Per-language contribution breakdown"},
+		"busfactor":  {Type: "list", Grain: "language", Description: "Bus factor per language"},
+		"activity":   {Type: "time_series", Grain: "tick", Description: "Per-tick commit activity by developer"},
+		"churn":      {Type: "time_series", Grain: "tick", Description: "Per-tick lines added/removed"},
+		"aggregate":  {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/file-history": {
+		"file_churn":        {Type: "list", Grain: "file", Description: "Per-file change frequency and contributors"},
+		"file_contributors": {Type: "list", Grain: "file", Description: "Per-file contributor breakdown"},
+		"hotspots":          {Type: "risk", Grain: "file", Description: "High-churn files"},
+		"composition":       {Type: "aggregate", Description: "File type composition"},
+		"composition_ts":    {Type: "time_series", Grain: "tick", Description: "File composition over time"},
+		"aggregate":         {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/couples": {
+		"file_coupling":      {Type: "list", Grain: "pair", Description: "Co-changed file pairs"},
+		"developer_coupling": {Type: "list", Grain: "pair", Description: "Developer collaboration pairs"},
+		"file_ownership":     {Type: "list", Grain: "file", Description: "Per-file ownership"},
+		"aggregate":          {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/shotness": {
+		"node_hotness":  {Type: "list", Grain: "node", Description: "AST node change frequency"},
+		"node_coupling": {Type: "list", Grain: "pair", Description: "Co-changed AST node pairs"},
+		"hotspot_nodes": {Type: "risk", Grain: "node", Description: "Frequently changed nodes"},
+		"aggregate":     {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/burndown": {
+		"global_survival":    {Type: "time_series", Grain: "sample", Description: "Global code survival curve"},
+		"file_survival":      {Type: "list", Grain: "file", Description: "Per-file survival data"},
+		"developer_survival": {Type: "list", Grain: "developer", Description: "Per-developer survival data"},
+		"aggregate":          {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/quality": {
+		"time_series": {Type: "time_series", Grain: "tick", Description: "Per-tick code quality metrics"},
+		"aggregate":   {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/imports": {
+		"import_list":  {Type: "list", Grain: "import", Description: "Import statements (requires UAST mode)"},
+		"dependencies": {Type: "list", Grain: "dependency", Description: "Dependencies (requires UAST mode)"},
+		"categories":   {Type: "aggregate", Description: "Import category breakdown"},
+		"aggregate":    {Type: "aggregate", Description: "Summary statistics"},
+	},
+	"history/typos": {
+		"typos": {Type: "list", Grain: "identifier", Description: "Detected identifier typos (requires UAST mode)"},
+	},
+}
diff --git a/internal/analyzers/analyze/schema_registry_test.go b/internal/analyzers/analyze/schema_registry_test.go
new file mode 100644
index 0000000..00d344c
--- /dev/null
+++ b/internal/analyzers/analyze/schema_registry_test.go
@@ -0,0 +1,66 @@
+package analyze_test
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+const (
+	testAnalyzerComplexity = "static/complexity"
+	testAnalyzerSentiment  = "history/sentiment"
+	testFieldFunctions     = "function_complexity"
+	testFieldTimeSeries    = "time_series"
+)
+
+func TestSchemaForAnalyzer_Known(t *testing.T) {
+	t.Parallel()
+
+	schema := analyze.SchemaForAnalyzer(testAnalyzerComplexity)
+
+	require.NotNil(t, schema)
+	assert.Contains(t, schema, testFieldFunctions)
+	assert.Equal(t, "list", schema[testFieldFunctions].Type)
+	assert.Equal(t, "function", schema[testFieldFunctions].Grain)
+}
+
+func TestSchemaForAnalyzer_HistoryAnalyzer(t *testing.T) {
+	t.Parallel()
+
+	schema := analyze.SchemaForAnalyzer(testAnalyzerSentiment)
+
+	require.NotNil(t, schema)
+	assert.Contains(t, schema, testFieldTimeSeries)
+	assert.Equal(t, "time_series", schema[testFieldTimeSeries].Type)
+	assert.Equal(t, "tick", schema[testFieldTimeSeries].Grain)
+}
+
+func TestSchemaForAnalyzer_Unknown(t *testing.T) {
+	t.Parallel()
+
+	schema := analyze.SchemaForAnalyzer("unknown/analyzer")
+
+	assert.Nil(t, schema)
+}
+
+func TestSchemaForAnalyzer_AllRegistered(t *testing.T) {
+	t.Parallel()
+
+	knownIDs := []string{
+		"static/complexity", "static/halstead", "static/cohesion",
+		"static/comments", "static/clones", "static/imports",
+		"static/composition",
+		"history/sentiment", "history/anomaly", "history/devs",
+		"history/file-history", "history/couples", "history/shotness",
+		"history/burndown", "history/quality", "history/imports",
+		"history/typos",
+	}
+
+	for _, id := range knownIDs {
+		schema := analyze.SchemaForAnalyzer(id)
+		assert.NotNilf(t, schema, "schema missing for %s", id)
+	}
+}
diff --git a/internal/analyzers/analyze/static.go b/internal/analyzers/analyze/static.go
index 80c0d6b..f7ecd17 100644
--- a/internal/analyzers/analyze/static.go
+++ b/internal/analyzers/analyze/static.go
@@ -15,9 +15,12 @@ import (
 	"sync/atomic"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/plotpage"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
+	"github.com/Sumatoshi-tech/codefang/internal/storage"
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 	"github.com/Sumatoshi-tech/codefang/pkg/meminfo"
 	"github.com/Sumatoshi-tech/codefang/pkg/pipeline"
+	"github.com/Sumatoshi-tech/codefang/pkg/textutil"
 	"github.com/Sumatoshi-tech/codefang/pkg/uast"
 	"github.com/Sumatoshi-tech/codefang/pkg/uast/pkg/node"
 )
@@ -69,7 +72,8 @@ type StaticRenderer interface {
 
 // StaticService provides a high-level interface for running static analysis.
 type StaticService struct {
-	Analyzers []StaticAnalyzer
+	UASTAnalyzers    []StaticAnalyzer
+	RawFileAnalyzers []RawFileAnalyzer
 
 	// MaxWorkers limits the number of concurrent file analysis goroutines.
 	// Zero means use min(runtime.NumCPU(), DefaultStaticMaxWorkers).
@@ -102,11 +106,51 @@ type StaticService struct {
 	// Renderer provides section-based output rendering.
 	// Must be set before calling FormatJSON, FormatText, FormatCompact, or RunAndFormat.
 	Renderer StaticRenderer
+
+	// PerFile enables per-file report retention in aggregators.
+	// When true, aggregators store per-file snapshots accessible via PerFileResults.
+	PerFile bool
+
+	// LanguageGlobs restricts the directory walk to files whose basename
+	// matches any of the given fnmatch-style globs (e.g. "*.go",
+	// "Dockerfile"). Built from --languages via langpath.Globs. Empty or
+	// nil disables the filter — default behavior.
+	LanguageGlobs []string
+
+	// PathPolicy carries vendor / generated / extra-prefix exclusion
+	// rules shared across phases. With the zero value, paths matched by
+	// enry.IsVendor and generated files detected by pathfilter are
+	// excluded by default.
+	PathPolicy pathpolicy.Options
+
+	// perFileResults is populated after AnalyzeFolder when PerFile is true.
+	// Keyed by analyzer name → file path → per-file report.
+	perFileResults map[string]map[string]Report
+
+	// analysisRootPath is the root path used in the last AnalyzeFolder call.
+	// Used by FormatJSON to make per-file paths relative.
+	analysisRootPath string
 }
 
 // NewStaticService creates a StaticService with the given analyzers.
-func NewStaticService(analyzers []StaticAnalyzer) *StaticService {
-	return &StaticService{Analyzers: analyzers}
+func NewStaticService(uastAnalyzers []StaticAnalyzer, rawAnalyzers []RawFileAnalyzer) *StaticService {
+	return &StaticService{UASTAnalyzers: uastAnalyzers, RawFileAnalyzers: rawAnalyzers}
+}
+
+// allFormattable returns a merged, deterministically-ordered slice of all analyzers
+// that satisfy FormattableAnalyzer (UAST first, then raw-file).
+func (svc *StaticService) allFormattable() []FormattableAnalyzer {
+	result := make([]FormattableAnalyzer, 0, len(svc.UASTAnalyzers)+len(svc.RawFileAnalyzers))
+
+	for _, a := range svc.UASTAnalyzers {
+		result = append(result, a)
+	}
+
+	for _, a := range svc.RawFileAnalyzers {
+		result = append(result, a)
+	}
+
+	return result
 }
 
 // ResolveMaxWorkers returns the effective worker count for parallel file analysis.
@@ -177,44 +221,158 @@ func (svc *StaticService) emitProgress(
 // Workers block naturally when the buffer is full, providing backpressure.
 const streamFilesBufSize = 100
 
+// analysisPipelineState threads shared state through pipeline phases.
+type analysisPipelineState struct {
+	rootPath       string
+	analyzersToRun []string
+	aggregators    map[string]ResultAggregator
+}
+
 // AnalyzeFolder runs static analyzers for supported files in a folder tree.
-// File discovery streams paths to workers via a channel, providing natural backpressure.
+// Executes raw-file and UAST phases sequentially via pipeline.RunPhases.
 func (svc *StaticService) AnalyzeFolder(ctx context.Context, rootPath string, analyzerList []string) (map[string]Report, error) {
-	analyzersToRun := svc.resolveAnalyzerList(analyzerList)
-	aggregators := svc.initAggregators(analyzersToRun)
+	svc.analysisRootPath = rootPath
 
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel()
 
+	state := analysisPipelineState{
+		rootPath:       rootPath,
+		analyzersToRun: svc.resolveAnalyzerList(analyzerList),
+	}
+	state.aggregators = svc.initAggregators(state.analyzersToRun)
+
+	state, err := pipeline.RunPhases(ctx, state,
+		pipeline.PhaseFunc[analysisPipelineState](svc.rawFilePhase),
+		pipeline.PhaseFunc[analysisPipelineState](svc.uastPhase),
+	)
+	if err != nil {
+		return nil, err
+	}
+
+	results := buildFinalResults(state.aggregators)
+
+	if svc.PerFile {
+		svc.perFileResults = extractPerFileResults(state.aggregators)
+	}
+
+	return results, nil
+}
+
+// rawFilePhase walks files that pass LanguageGlobs and PathPolicy and runs RawFileAnalyzers on their headers.
+func (svc *StaticService) rawFilePhase(ctx context.Context, state analysisPipelineState) (analysisPipelineState, error) {
+	if len(svc.RawFileAnalyzers) == 0 {
+		return state, nil
+	}
+
+	// Filter to only requested raw-file analyzers.
+	rawNames := svc.requestedRawFileAnalyzers(state.analyzersToRun)
+	if len(rawNames) == 0 {
+		return state, nil
+	}
+
+	var mu sync.Mutex
+
+	walkErr := filepath.WalkDir(state.rootPath, func(path string, entry os.DirEntry, err error) error {
+		if ctx.Err() != nil {
+			return ctx.Err()
+		}
+
+		skip, skipErr := skipAllFilesEntry(entry, err)
+		if skip || skipErr != nil {
+			return skipErr
+		}
+
+		if !matchesLanguageGlobs(path, svc.LanguageGlobs) {
+			return nil
+		}
+
+		if pathpolicy.Exclude(path, nil, svc.PathPolicy) {
+			return nil
+		}
+
+		classifyFile(path, rawNames, state.aggregators, &mu, state.rootPath)
+
+		return nil
+	})
+	if walkErr != nil {
+		return state, fmt.Errorf("raw-file phase walk %s: %w", state.rootPath, walkErr)
+	}
+
+	return state, nil
+}
+
+// requestedRawFileAnalyzers returns RawFileAnalyzers whose names appear in the requested list.
+func (svc *StaticService) requestedRawFileAnalyzers(requested []string) []RawFileAnalyzer {
+	nameSet := make(map[string]struct{}, len(requested))
+	for _, n := range requested {
+		nameSet[n] = struct{}{}
+	}
+
+	var result []RawFileAnalyzer
+
+	for _, a := range svc.RawFileAnalyzers {
+		if _, ok := nameSet[a.Name()]; ok {
+			result = append(result, a)
+		}
+	}
+
+	return result
+}
+
+// uastPhase streams UAST-supported files and runs StaticAnalyzers in parallel.
+func (svc *StaticService) uastPhase(ctx context.Context, state analysisPipelineState) (analysisPipelineState, error) {
+	uastNames := svc.requestedUASTAnalyzers(state.analyzersToRun)
+	if len(uastNames) == 0 {
+		return state, nil
+	}
+
 	var fileCounter atomic.Int64
 
 	fileCh := make(chan string, streamFilesBufSize)
 	walkErrCh := make(chan error, 1)
 
 	go func() {
-		walkErrCh <- svc.streamFiles(ctx, rootPath, fileCh)
+		walkErrCh <- svc.streamFiles(ctx, state.rootPath, fileCh)
 	}()
 
-	poolErr := svc.analyzeFilesParallel(ctx, fileCh, analyzersToRun, aggregators, &fileCounter)
+	poolErr := svc.analyzeFilesParallel(ctx, fileCh, uastNames, state.aggregators, &fileCounter, state.rootPath)
 
 	walkErr := <-walkErrCh
 
 	if poolErr != nil {
-		return nil, poolErr
+		return state, poolErr
 	}
 
 	if walkErr != nil {
-		return nil, walkErr
+		return state, walkErr
 	}
 
-	results := buildFinalResults(aggregators)
+	svc.emitProgress(fileCounter.Load(), state.aggregators, ProgressPhaseComplete)
 
-	svc.emitProgress(fileCounter.Load(), aggregators, ProgressPhaseComplete)
+	return state, nil
+}
 
-	return results, nil
+// requestedUASTAnalyzers returns names of UAST analyzers that appear in the requested list.
+func (svc *StaticService) requestedUASTAnalyzers(requested []string) []string {
+	nameSet := make(map[string]struct{}, len(svc.UASTAnalyzers))
+	for _, a := range svc.UASTAnalyzers {
+		nameSet[a.Name()] = struct{}{}
+	}
+
+	result := make([]string, 0, len(requested))
+
+	for _, name := range requested {
+		if _, ok := nameSet[name]; ok {
+			result = append(result, name)
+		}
+	}
+
+	return result
 }
 
-// streamFiles walks the directory tree and sends supported file paths on fileCh.
+// streamFiles walks the directory tree and sends UAST-supported file paths on fileCh,
+// skipping paths rejected by LanguageGlobs or PathPolicy.
 // The channel is closed when the walk completes. Returns walk errors.
 func (svc *StaticService) streamFiles(ctx context.Context, rootPath string, fileCh chan<- string) error {
 	defer close(fileCh)
@@ -234,6 +392,14 @@ func (svc *StaticService) streamFiles(ctx context.Context, rootPath string, file
 			return skipErr
 		}
 
+		if !matchesLanguageGlobs(path, svc.LanguageGlobs) {
+			return nil
+		}
+
+		if pathpolicy.Exclude(path, nil, svc.PathPolicy) {
+			return nil
+		}
+
 		select {
 		case fileCh <- path:
 		case <-ctx.Done():
@@ -266,6 +432,7 @@ func (svc *StaticService) analyzeFilesParallel(
 	analyzersToRun []string,
 	aggregators map[string]ResultAggregator,
 	fileCounter *atomic.Int64,
+	rootPath string,
 ) error {
 	var mu sync.Mutex
 
@@ -294,7 +461,8 @@ func (svc *StaticService) analyzeFilesParallel(
 				return analyzeErr
 			}
 
-			StampSourceFile(reportMap, filePath)
+			StampSourceFile(reportMap, filePath, rootPath)
+			StampLanguage(reportMap, parser.GetLanguage(filePath))
 
 			mu.Lock()
 			aggregateFolderAnalysis(reportMap, aggregators)
@@ -334,18 +502,51 @@ func acquireParser(ch chan *uast.Parser) (*uast.Parser, error) {
 }
 
 // StampSourceFile adds "_source_file" metadata to every collection item in each report.
-// This allows downstream consumers (e.g., plot generators) to group results by file/package.
+// Also sets SourceFileKey at the report top level for analyzers without collections (e.g., imports).
+// This allows downstream consumers (e.g., plot generators, per-file retention) to group results by file.
 // Handles both legacy []map[string]any collections and TypedCollection wrappers.
-func StampSourceFile(reports map[string]Report, filePath string) {
+// When rootPath is non-empty, the stamped path is made relative to it.
+func StampSourceFile(reports map[string]Report, filePath, rootPath string) {
+	stamped := MakeRelativePath(filePath, rootPath)
+	dir := filepath.Dir(stamped)
+
 	for _, report := range reports {
+		report[SourceFileKey] = stamped
+		report[DirectoryKey] = dir
+
 		for key, val := range report {
 			switch v := val.(type) {
 			case TypedCollection:
-				v.SourceFile = filePath
+				v.SourceFile = stamped
+				v.Directory = dir
 				report[key] = v
 			case []map[string]any:
 				for _, item := range v {
-					item[SourceFileKey] = filePath
+					item[SourceFileKey] = stamped
+					item[DirectoryKey] = dir
+				}
+			}
+		}
+	}
+}
+
+// StampLanguage sets "_language" on each report and on every collection item; it is a no-op for an empty language.
+func StampLanguage(reports map[string]Report, language string) {
+	if language == "" {
+		return
+	}
+
+	for _, report := range reports {
+		report[LanguageKey] = language
+
+		for key, val := range report {
+			switch v := val.(type) {
+			case TypedCollection:
+				v.Language = language
+				report[key] = v
+			case []map[string]any:
+				for _, item := range v {
+					item[LanguageKey] = language
 				}
 			}
 		}
@@ -393,22 +594,101 @@ func (svc *StaticService) analyzeFile(
 		return nil, fmt.Errorf("read %s: %w", path, err)
 	}
 
-	uastNode, err := parser.Parse(ctx, path, content)
-	if err != nil {
-		return nil, fmt.Errorf("parse %s: %w", path, err)
+	uastNode, parseErr := parser.Parse(ctx, path, content)
+	if parseErr != nil {
+		return nil, fmt.Errorf("parse %s: %w", path, parseErr)
 	}
 
-	results, err := svc.runAnalyzers(ctx, uastNode, analyzersToRun)
+	results, runErr := svc.runAnalyzers(ctx, uastNode, analyzersToRun)
 
 	node.ReleaseTree(uastNode)
 
-	if err != nil {
-		return nil, fmt.Errorf("run analyzers for %s: %w", path, err)
+	if runErr != nil {
+		return nil, fmt.Errorf("run analyzers for %s: %w", path, runErr)
 	}
 
 	return results, nil
 }
 
+// contentHeaderSize is the max bytes read per file in the all-files pre-pass.
+// Enry needs only a prefix for binary/language detection.
+const contentHeaderSize = 8192
+
+// skipAllFilesEntry decides if a walk entry should be skipped in the raw-file phase.
+func skipAllFilesEntry(entry os.DirEntry, walkErr error) (bool, error) {
+	if walkErr != nil {
+		if errors.Is(walkErr, fs.ErrPermission) || errors.Is(walkErr, fs.ErrNotExist) {
+			if entry != nil && entry.IsDir() {
+				return true, filepath.SkipDir
+			}
+
+			return true, nil
+		}
+
+		return false, walkErr
+	}
+
+	if entry == nil {
+		return true, nil
+	}
+
+	if entry.IsDir() {
+		if entry.Name() == ".git" {
+			return true, filepath.SkipDir
+		}
+
+		return true, nil
+	}
+
+	return false, nil
+}
+
+// classifyFile runs raw-file analyzers on a single file and aggregates results.
+func classifyFile(
+	path string,
+	analyzers []RawFileAnalyzer,
+	aggregators map[string]ResultAggregator,
+	mu *sync.Mutex,
+	rootPath string,
+) {
+	header := readFileHeader(path, contentHeaderSize)
+
+	for _, a := range analyzers {
+		report, analyzeErr := a.AnalyzeFileContent(path, header)
+		if analyzeErr != nil {
+			continue
+		}
+
+		report[SourceFileKey] = MakeRelativePath(path, rootPath)
+
+		mu.Lock()
+
+		if agg, ok := aggregators[a.Name()]; ok {
+			agg.Aggregate(map[string]Report{a.Name(): report})
+		}
+
+		mu.Unlock()
+	}
+}
+
+// readFileHeader reads up to limit bytes from a file. Returns nil on error.
+func readFileHeader(path string, limit int) []byte {
+	f, err := os.Open(path)
+	if err != nil {
+		return nil
+	}
+	defer f.Close()
+
+	buf := make([]byte, limit)
+
+	n, readErr := io.ReadFull(f, buf)
+	if readErr != nil && !errors.Is(readErr, io.EOF) && !errors.Is(readErr, io.ErrUnexpectedEOF) {
+		return nil
+	}
+
+	return buf[:n]
+}
+
 func aggregateFolderAnalysis(results map[string]Report, aggregators map[string]ResultAggregator) {
 	for analyzerName, aggregator := range aggregators {
 		report, found := results[analyzerName]
@@ -425,9 +705,10 @@ func (svc *StaticService) resolveAnalyzerList(analyzerList []string) []string {
 		return analyzerList
 	}
 
-	names := make([]string, 0, len(svc.Analyzers))
+	all := svc.allFormattable()
+	names := make([]string, 0, len(all))
 
-	for _, analyzer := range svc.Analyzers {
+	for _, analyzer := range all {
 		names = append(names, analyzer.Name())
 	}
 
@@ -436,10 +717,11 @@ func (svc *StaticService) resolveAnalyzerList(analyzerList []string) []string {
 
 func (svc *StaticService) initAggregators(analyzersToRun []string) map[string]ResultAggregator {
 	aggregators := make(map[string]ResultAggregator)
+	byName := svc.analyzersByName()
 
 	for _, analyzerName := range analyzersToRun {
-		analyzer := svc.FindAnalyzer(analyzerName)
-		if analyzer == nil {
+		analyzer, found := byName[analyzerName]
+		if !found {
 			continue
 		}
 
@@ -453,6 +735,10 @@ func (svc *StaticService) initAggregators(analyzersToRun []string) map[string]Re
 			setter.SetSpillThreshold(svc.SpillThreshold)
 		}
 
+		if pf, ok := agg.(PerFileModeEnabled); svc.PerFile && ok {
+			pf.SetPerFileMode(true)
+		}
+
 		aggregators[analyzerName] = agg
 	}
 
@@ -473,7 +759,7 @@ func buildFinalResults(aggregators map[string]ResultAggregator) map[string]Repor
 func (svc *StaticService) BuildSections(results map[string]Report) []ReportSection {
 	sections := make([]ReportSection, 0, len(results))
 
-	for _, currentAnalyzer := range svc.Analyzers {
+	for _, currentAnalyzer := range svc.allFormattable() {
 		report, found := results[currentAnalyzer.Name()]
 		if !found {
 			continue
@@ -488,30 +774,34 @@ func (svc *StaticService) BuildSections(results map[string]Report) []ReportSecti
 }
 
 func (svc *StaticService) runAnalyzers(ctx context.Context, uastNode *node.Node, analyzerList []string) (map[string]Report, error) {
-	factory := NewFactory(svc.Analyzers)
+	factory := NewFactory(svc.UASTAnalyzers)
 
 	return factory.RunAnalyzers(ctx, uastNode, analyzerList)
 }
 
-// FindAnalyzer finds an analyzer by name.
-func (svc *StaticService) FindAnalyzer(name string) StaticAnalyzer {
-	for _, analyzer := range svc.Analyzers {
-		if analyzer.Name() == name {
-			return analyzer
-		}
+// analyzersByName builds a name-to-analyzer lookup map from all formattable analyzers.
+func (svc *StaticService) analyzersByName() map[string]FormattableAnalyzer {
+	all := svc.allFormattable()
+	result := make(map[string]FormattableAnalyzer, len(all))
+
+	for _, a := range all {
+		result[a.Name()] = a
 	}
 
-	return nil
+	return result
 }
 
 // AnalyzerNamesByID resolves analyzer descriptor IDs to internal names.
 func (svc *StaticService) AnalyzerNamesByID(ids []string) ([]string, error) {
-	idToName := make(map[string]string, len(svc.Analyzers))
-	for _, analyzer := range svc.Analyzers {
+	all := svc.allFormattable()
+	idToName := make(map[string]string, len(all))
+
+	for _, analyzer := range all {
 		idToName[analyzer.Descriptor().ID] = analyzer.Name()
 	}
 
 	names := make([]string, 0, len(ids))
+
 	for _, id := range ids {
 		name, ok := idToName[id]
 		if !ok {
@@ -533,6 +823,10 @@ func (svc *StaticService) FormatJSON(results map[string]Report, writer io.Writer
 	sections := svc.BuildSections(results)
 	report := svc.Renderer.SectionsToJSON(sections)
 
+	if svc.PerFile {
+		report = svc.enrichWithPerFileData(report, sections)
+	}
+
 	encoder := json.NewEncoder(writer)
 	encoder.SetIndent("", "  ")
 
@@ -574,6 +868,7 @@ func (svc *StaticService) FormatPerAnalyzer(
 	writer io.Writer,
 ) error {
 	isFirst := true
+	byName := svc.analyzersByName()
 
 	for _, analyzerName := range analyzerNames {
 		report, ok := results[analyzerName]
@@ -581,8 +876,8 @@ func (svc *StaticService) FormatPerAnalyzer(
 			continue
 		}
 
-		analyzer := svc.FindAnalyzer(analyzerName)
-		if analyzer == nil {
+		analyzer, found := byName[analyzerName]
+		if !found {
 			return fmt.Errorf("%w: %s", ErrUnknownAnalyzerID, analyzerName)
 		}
 
@@ -644,6 +939,7 @@ func (svc *StaticService) RenderPlotPages(
 	}
 
 	pages := make([]plotpage.PageMeta, 0, len(analyzerNames))
+	byName := svc.analyzersByName()
 
 	for _, name := range analyzerNames {
 		report, ok := results[name]
@@ -651,8 +947,8 @@ func (svc *StaticService) RenderPlotPages(
 			continue
 		}
 
-		analyzer := svc.FindAnalyzer(name)
-		if analyzer == nil {
+		analyzer, found := byName[name]
+		if !found {
 			continue
 		}
 
@@ -684,9 +980,15 @@ func (svc *StaticService) RenderPlotPages(
 	return pages, nil
 }
 
+// reportJSONFilename is the name of the machine-readable JSON report emitted alongside plot pages.
+const reportJSONFilename = "report.json"
+
+// reportJSONPerm is the file permission for report.json.
+const reportJSONPerm = 0o640
+
 // FormatPlotPages renders multi-page HTML plot output to outputDir.
 // Each analyzer gets its own HTML page plus an index page with navigation.
-// FRD: specs/frds/FRD-20260312-static-plot-multipage.md.
+// Also emits report.json with the raw analysis results for external dashboards.
 func (svc *StaticService) FormatPlotPages(
 	analyzerNames []string,
 	results map[string]Report,
@@ -697,13 +999,27 @@ func (svc *StaticService) FormatPlotPages(
 		return err
 	}
 
-	renderer := &plotpage.MultiPageRenderer{
+	mpRenderer := &plotpage.MultiPageRenderer{
 		OutputDir: outputDir,
 		Title:     plotPageTitle,
 		Theme:     plotpage.ThemeDark,
 	}
 
-	return renderer.RenderIndex(pages)
+	indexErr := mpRenderer.RenderIndex(pages)
+	if indexErr != nil {
+		return indexErr
+	}
+
+	return writeReportJSON(results, outputDir)
+}
+
+// writeReportJSON writes the analysis results as indented JSON to outputDir/report.json.
+func writeReportJSON(results map[string]Report, outputDir string) error {
+	reportPath := filepath.Join(outputDir, reportJSONFilename)
+
+	return storage.WriteAtomic(reportPath, reportJSONPerm, func(w io.Writer) error {
+		return textutil.WriteJSON(w, results, true)
+	})
 }
 
 // ResolveAggregationMode returns the aggregation mode for a given output format.
diff --git a/internal/analyzers/analyze/static_bench_test.go b/internal/analyzers/analyze/static_bench_test.go
index 43c3ceb..bf84aa0 100644
--- a/internal/analyzers/analyze/static_bench_test.go
+++ b/internal/analyzers/analyze/static_bench_test.go
@@ -1,11 +1,5 @@
 package analyze_test
 
-// FRD: specs/frds/FRD-20260311-cap-static-workers.md.
-// FRD: specs/frds/FRD-20260311-static-malloc-trim.md.
-// FRD: specs/frds/FRD-20260311-static-memory-limit.md.
-// FRD: specs/frds/FRD-20260311-bounded-parser-pool.md.
-// FRD: specs/frds/FRD-20260311-eager-tree-release.md.
-
 import (
 	"context"
 	"fmt"
@@ -176,7 +170,7 @@ func BenchmarkStaticPeakParsers(b *testing.B) {
 	dir := setupHeavyBenchDir(b, benchPeakFileCount, benchPeakFunctionsPerFile)
 
 	b.Run("before-uncapped", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = runtime.NumCPU()
 		svc.MallocTrimInterval = -1
 
@@ -198,7 +192,7 @@ func BenchmarkStaticPeakParsers(b *testing.B) {
 	})
 
 	b.Run("after-capped", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchCappedWorkers
 		svc.MallocTrimInterval = -1
 
@@ -226,7 +220,7 @@ func BenchmarkStaticMallocTrim(b *testing.B) {
 	dir := setupHeavyBenchDir(b, benchMallocTrimFileCount, benchMallocTrimFunctionsPerFile)
 
 	b.Run("before-no-trim", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchCappedWorkers
 		svc.MallocTrimInterval = -1
 
@@ -249,7 +243,7 @@ func BenchmarkStaticMallocTrim(b *testing.B) {
 	})
 
 	b.Run("after-trim-enabled", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchCappedWorkers
 		svc.MallocTrimInterval = benchMallocTrimInterval
 		// NativeMemoryReleaseFn=nil uses real gitlib.ReleaseNativeMemory().
@@ -279,7 +273,7 @@ func BenchmarkStaticMemoryLimit(b *testing.B) {
 	dir := setupHeavyBenchDir(b, benchMemLimitFileCount, benchMemLimitFunctionsPerFile)
 
 	b.Run("before-no-limit", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchMemLimitWorkers
 		svc.MallocTrimInterval = -1
 
@@ -301,7 +295,7 @@ func BenchmarkStaticMemoryLimit(b *testing.B) {
 	})
 
 	b.Run("after-with-limit", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchMemLimitWorkers
 		svc.MallocTrimInterval = -1
 
@@ -333,7 +327,7 @@ func BenchmarkStaticParserPool(b *testing.B) {
 	dir := setupHeavyBenchDir(b, benchParserPoolFileCount, benchParserPoolFunctionsPerFile)
 
 	b.Run("before-workers-NumCPU", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = runtime.NumCPU()
 		svc.MallocTrimInterval = -1
 
@@ -350,7 +344,7 @@ func BenchmarkStaticParserPool(b *testing.B) {
 	})
 
 	b.Run("after-workers-4", func(b *testing.B) {
-		svc := analyze.NewStaticService(testStaticAnalyzers())
+		svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 		svc.MaxWorkers = benchCappedWorkers
 		svc.MallocTrimInterval = -1
 
@@ -379,7 +373,7 @@ func TestStaticPeakParsers_BoundedConcurrency(t *testing.T) {
 
 	dir := setupHeavyBenchDir(t, fileCount, benchPeakFunctionsPerFile)
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = maxWorkers
 
 	require.Equal(t, maxWorkers, svc.ResolveMaxWorkers())
diff --git a/internal/analyzers/analyze/static_language.go b/internal/analyzers/analyze/static_language.go
new file mode 100644
index 0000000..b347b99
--- /dev/null
+++ b/internal/analyzers/analyze/static_language.go
@@ -0,0 +1,24 @@
+package analyze
+
+// Static-side --languages filter.
+
+import "path/filepath"
+
+// matchesLanguageGlobs reports whether name's basename matches any of
+// the given fnmatch-style globs. An empty or nil globs slice disables
+// filtering and returns true.
+func matchesLanguageGlobs(name string, globs []string) bool {
+	if len(globs) == 0 {
+		return true
+	}
+
+	base := filepath.Base(name)
+	for _, g := range globs {
+		ok, err := filepath.Match(g, base)
+		if err == nil && ok {
+			return true
+		}
+	}
+
+	return false
+}
diff --git a/internal/analyzers/analyze/static_language_test.go b/internal/analyzers/analyze/static_language_test.go
new file mode 100644
index 0000000..fadd281
--- /dev/null
+++ b/internal/analyzers/analyze/static_language_test.go
@@ -0,0 +1,195 @@
+package analyze_test
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/composition"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
+)
+
+func TestStaticService_AnalyzeFolder_PathPolicy_DefaultsDropVendorAndGenerated(t *testing.T) {
+	t.Parallel()
+
+	tmpDir := t.TempDir()
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "keep.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.MkdirAll(filepath.Join(tmpDir, "vendor", "lib"), 0o750))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "vendor", "lib", "vendored.go"),
+			[]byte("package lib\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "api.pb.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+
+	composer := composition.NewAnalyzer()
+	svc := analyze.NewStaticService(nil, []analyze.RawFileAnalyzer{composer})
+
+	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{composer.Name()})
+	require.NoError(t, err)
+
+	report := results[composer.Name()]
+	assert.EqualValues(t, 1, report["total_files"],
+		"default path policy must drop vendor/lib/vendored.go and api.pb.go")
+}
+
+func TestStaticService_AnalyzeFolder_PathPolicy_IncludeVendoredAndGeneratedRestoresAll(t *testing.T) {
+	t.Parallel()
+
+	tmpDir := t.TempDir()
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "keep.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.MkdirAll(filepath.Join(tmpDir, "vendor", "lib"), 0o750))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "vendor", "lib", "vendored.go"),
+			[]byte("package lib\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "api.pb.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+
+	composer := composition.NewAnalyzer()
+	svc := analyze.NewStaticService(nil, []analyze.RawFileAnalyzer{composer})
+	svc.PathPolicy = pathpolicy.Options{
+		IncludeVendored:  true,
+		IncludeGenerated: true,
+	}
+
+	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{composer.Name()})
+	require.NoError(t, err)
+
+	report := results[composer.Name()]
+	assert.EqualValues(t, 3, report["total_files"],
+		"include-vendored + include-generated must restore the unfiltered behavior: all 3 files counted")
+}
+
+func TestStaticService_AnalyzeFolder_NilLanguageGlobs_ProcessesAllSupportedFiles(t *testing.T) {
+	t.Parallel()
+
+	tmpDir := t.TempDir()
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "a.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "b.py"),
+			[]byte("def f():\n    pass\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "c.js"),
+			[]byte("function f() {}\n"), 0o600))
+
+	composer := composition.NewAnalyzer()
+	svc := analyze.NewStaticService(nil, []analyze.RawFileAnalyzer{composer})
+
+	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{composer.Name()})
+	require.NoError(t, err)
+
+	report := results[composer.Name()]
+	assert.EqualValues(t, 3, report["total_files"],
+		"nil LanguageGlobs must disable filtering: all 3 files processed")
+}
+
+func TestStaticService_AnalyzeFolder_LanguageGlobs_FiltersRawFileWalk(t *testing.T) {
+	t.Parallel()
+
+	tmpDir := t.TempDir()
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "keep.go"),
+			[]byte("package main\nfunc F() {}\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "drop.py"),
+			[]byte("def f():\n    pass\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "drop.js"),
+			[]byte("function f() {}\n"), 0o600))
+
+	composer := composition.NewAnalyzer()
+	svc := analyze.NewStaticService(nil, []analyze.RawFileAnalyzer{composer})
+	svc.LanguageGlobs = []string{"*.go"}
+
+	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{composer.Name()})
+	require.NoError(t, err)
+	require.Contains(t, results, composer.Name())
+
+	report := results[composer.Name()]
+
+	assert.EqualValues(t, 1, report["total_files"],
+		"raw-file walker must skip paths outside LanguageGlobs: "+
+			"only keep.go should reach the composition analyzer")
+}
+
+func TestStaticService_AnalyzeFolder_LanguageGlobs_FiltersUASTWalk(t *testing.T) {
+	t.Parallel()
+
+	tmpDir := t.TempDir()
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "keep.go"),
+			[]byte("package main\nfunc F() { x := 1; _ = x }\n"), 0o600))
+	require.NoError(t,
+		os.WriteFile(filepath.Join(tmpDir, "drop.py"),
+			[]byte("def f():\n    x = 1\n    return x\n"), 0o600))
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.LanguageGlobs = []string{"*.go"}
+	svc.PerFile = true
+
+	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{"complexity"})
+	require.NoError(t, err)
+	require.Contains(t, results, "complexity")
+
+	perFile := svc.PerFileResults()["complexity"]
+	assert.Contains(t, perFile, "keep.go",
+		"Go file must reach the complexity analyzer when the *.go language glob is active")
+	assert.NotContains(t, perFile, "drop.py",
+		"Python file must be filtered out before the parser runs")
+}
+
+func TestMatchesLanguageGlobs_NilGlobs_AllowsAnyName(t *testing.T) {
+	t.Parallel()
+
+	assert.True(t, analyze.LanguageGlobMatcher("anything.go", nil),
+		"nil globs must be treated as no-filter and return true")
+}
+
+func TestMatchesLanguageGlobs_MultipleGlobs_MatchesUnion(t *testing.T) {
+	t.Parallel()
+
+	globs := []string{"*.go", "Dockerfile"}
+
+	assert.True(t, analyze.LanguageGlobMatcher("main.go", globs))
+	assert.True(t, analyze.LanguageGlobMatcher("Dockerfile", globs))
+	assert.False(t, analyze.LanguageGlobMatcher("main.py", globs),
+		"a name matching neither glob must be rejected")
+}
+
+func TestMatchesLanguageGlobs_StarDotGo_MatchesGoBasename(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name   string
+		path   string
+		want   bool
+		reason string
+	}{
+		{"go file", "foo.go", true, "*.go glob must match plain .go"},
+		{"nested go file", "/abs/dir/foo.go", true, "match on basename, not full path"},
+		{"python file", "foo.py", false, "*.go must not match .py"},
+		{"no extension", "Makefile", false, "*.go must not match extensionless"},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			got := analyze.LanguageGlobMatcher(tt.path, []string{"*.go"})
+			assert.Equal(t, tt.want, got, tt.reason)
+		})
+	}
+}
diff --git a/internal/analyzers/analyze/static_test.go b/internal/analyzers/analyze/static_test.go
index 23669f5..5ecb7ab 100644
--- a/internal/analyzers/analyze/static_test.go
+++ b/internal/analyzers/analyze/static_test.go
@@ -3,6 +3,7 @@ package analyze_test
 import (
 	"bytes"
 	"context"
+	"encoding/json"
 	"fmt"
 	"io/fs"
 	"os"
@@ -17,6 +18,7 @@ import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/cohesion"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/comments"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/complexity"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/halstead"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/imports"
@@ -134,7 +136,7 @@ func TestStaticService_AnalyzeFolder_SkipsPermissionDeniedDirectory(t *testing.T
 		require.NoError(t, os.Chmod(blockedDir, 0o750))
 	}()
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{"complexity"})
 	require.NoError(t, err)
 	require.Contains(t, results, "complexity")
@@ -186,14 +188,14 @@ func TestStampSourceFile(t *testing.T) {
 		},
 	}
 
-	analyze.StampSourceFile(reports, "/repo/pkg/auth/handler.go")
+	analyze.StampSourceFile(reports, "/repo/pkg/auth/handler.go", "/repo")
 
 	functions, ok := reports["cohesion"]["functions"].([]map[string]any)
 	require.True(t, ok)
 	require.Len(t, functions, 2)
 
 	for _, fn := range functions {
-		require.Equal(t, "/repo/pkg/auth/handler.go", fn["_source_file"])
+		require.Equal(t, "pkg/auth/handler.go", fn["_source_file"])
 	}
 }
 
@@ -203,7 +205,7 @@ func TestStampSourceFile_EmptyReport(t *testing.T) {
 	reports := map[string]analyze.Report{}
 
 	require.NotPanics(t, func() {
-		analyze.StampSourceFile(reports, "/some/path.go")
+		analyze.StampSourceFile(reports, "/some/path.go", "")
 	})
 }
 
@@ -219,12 +221,10 @@ func TestStampSourceFile_NoCollections(t *testing.T) {
 	}
 
 	require.NotPanics(t, func() {
-		analyze.StampSourceFile(reports, "/some/path.go")
+		analyze.StampSourceFile(reports, "/some/path.go", "")
 	})
 }
 
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
-
 func TestStampSourceFile_TypedCollection(t *testing.T) {
 	t.Parallel()
 
@@ -268,25 +268,23 @@ func TestStampSourceFile_TypedCollection(t *testing.T) {
 		},
 	}
 
-	analyze.StampSourceFile(reports, "/repo/pkg/foo.go")
+	analyze.StampSourceFile(reports, "/repo/pkg/foo.go", "/repo")
 
 	stamped, ok := reports["complexity"]["functions"].(analyze.TypedCollection)
 	require.True(t, ok)
-	assert.Equal(t, "/repo/pkg/foo.go", stamped.SourceFile)
+	assert.Equal(t, "pkg/foo.go", stamped.SourceFile)
 
 	// Verify converter produces maps with _source_file.
 	maps := stamped.ToMaps(stamped.Items, stamped.SourceFile)
 	require.Len(t, maps, 2)
-	assert.Equal(t, "/repo/pkg/foo.go", maps[0]["_source_file"])
-	assert.Equal(t, "/repo/pkg/foo.go", maps[1]["_source_file"])
+	assert.Equal(t, "pkg/foo.go", maps[0]["_source_file"])
+	assert.Equal(t, "pkg/foo.go", maps[1]["_source_file"])
 }
 
-// FRD: specs/frds/FRD-20260311-cap-static-workers.md.
-
 func TestStaticService_ResolveMaxWorkers_DefaultCapsAtEight(t *testing.T) {
 	t.Parallel()
 
-	svc := analyze.NewStaticService(nil)
+	svc := analyze.NewStaticService(nil, nil)
 	got := svc.ResolveMaxWorkers()
 
 	want := min(runtime.NumCPU(), analyze.DefaultStaticMaxWorkers)
@@ -308,7 +306,7 @@ func TestStaticService_AnalyzeFolder_RespectsMaxWorkers(t *testing.T) {
 		[]byte("package a\nfunc B() {}\n"), 0o600,
 	))
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = 1
 
 	results, err := svc.AnalyzeFolder(context.Background(), tmpDir, []string{"complexity"})
@@ -321,18 +319,16 @@ func TestStaticService_ResolveMaxWorkers_ExplicitOverride(t *testing.T) {
 
 	const explicitWorkers = 16
 
-	svc := analyze.NewStaticService(nil)
+	svc := analyze.NewStaticService(nil, nil)
 	svc.MaxWorkers = explicitWorkers
 
 	require.Equal(t, explicitWorkers, svc.ResolveMaxWorkers())
 }
 
-// FRD: specs/frds/FRD-20260311-static-malloc-trim.md.
-
 func TestStaticService_ResolveMallocTrimInterval_Default(t *testing.T) {
 	t.Parallel()
 
-	svc := analyze.NewStaticService(nil)
+	svc := analyze.NewStaticService(nil, nil)
 
 	require.Equal(t, analyze.DefaultMallocTrimInterval, svc.ResolveMallocTrimInterval())
 }
@@ -342,7 +338,7 @@ func TestStaticService_ResolveMallocTrimInterval_ExplicitOverride(t *testing.T)
 
 	const customInterval = 100
 
-	svc := analyze.NewStaticService(nil)
+	svc := analyze.NewStaticService(nil, nil)
 	svc.MallocTrimInterval = customInterval
 
 	require.Equal(t, customInterval, svc.ResolveMallocTrimInterval())
@@ -351,7 +347,7 @@ func TestStaticService_ResolveMallocTrimInterval_ExplicitOverride(t *testing.T)
 func TestStaticService_ResolveMallocTrimInterval_Disabled(t *testing.T) {
 	t.Parallel()
 
-	svc := analyze.NewStaticService(nil)
+	svc := analyze.NewStaticService(nil, nil)
 	svc.MallocTrimInterval = -1
 
 	require.Equal(t, -1, svc.ResolveMallocTrimInterval())
@@ -374,7 +370,7 @@ func TestStaticService_AnalyzeFolder_CallsMallocTrim(t *testing.T) {
 
 	var trimCalls atomic.Int64
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = 1
 	svc.MallocTrimInterval = trimInterval
 	svc.NativeMemoryReleaseFn = func() { trimCalls.Add(1) }
@@ -400,7 +396,7 @@ func TestStaticService_AnalyzeFolder_NoTrimWhenDisabled(t *testing.T) {
 
 	var trimCalls atomic.Int64
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = 1
 	svc.MallocTrimInterval = -1
 	svc.NativeMemoryReleaseFn = func() { trimCalls.Add(1) }
@@ -411,8 +407,6 @@ func TestStaticService_AnalyzeFolder_NoTrimWhenDisabled(t *testing.T) {
 	require.Zero(t, trimCalls.Load())
 }
 
-// FRD: specs/frds/FRD-20260311-summary-only-aggregation.md.
-
 func TestResolveAggregationMode_TextIsSummaryOnly(t *testing.T) {
 	t.Parallel()
 
@@ -451,7 +445,7 @@ func TestStaticService_SummaryOnly_MetricsPresent(t *testing.T) {
 		[]byte("package main\nfunc A() { x := 1; _ = x }\nfunc B() { y := 2; _ = y }\n"), 0o600,
 	))
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = 1
 	svc.MallocTrimInterval = -1
 	svc.AggregationMode = analyze.AggregationModeSummaryOnly
@@ -467,8 +461,6 @@ func TestStaticService_SummaryOnly_MetricsPresent(t *testing.T) {
 	require.Contains(t, report, "total_complexity")
 }
 
-// FRD: specs/frds/FRD-20260312-static-budget-tuning.md.
-
 func TestStaticService_SpillThreshold_AppliedToAggregators(t *testing.T) {
 	t.Parallel()
 
@@ -481,7 +473,7 @@ func TestStaticService_SpillThreshold_AppliedToAggregators(t *testing.T) {
 
 	const customThreshold = 5000
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.MaxWorkers = 1
 	svc.MallocTrimInterval = -1
 	svc.SpillThreshold = customThreshold
@@ -497,8 +489,6 @@ func TestStaticService_SpillThreshold_AppliedToAggregators(t *testing.T) {
 	assert.Equal(t, customThreshold, svc.SpillThreshold)
 }
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 func TestStaticService_ProgressFunc_CalledDuringAnalysis(t *testing.T) {
 	t.Parallel()
 
@@ -510,7 +500,7 @@ func TestStaticService_ProgressFunc_CalledDuringAnalysis(t *testing.T) {
 		writeTestGoFile(t, dir, fmt.Sprintf("file%d.go", i))
 	}
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.NativeMemoryReleaseFn = func() {}
 	svc.ProgressInterval = 2
 
@@ -538,7 +528,7 @@ func TestStaticService_ProgressFunc_Nil_NoError(t *testing.T) {
 	dir := t.TempDir()
 	writeTestGoFile(t, dir, "file.go")
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.NativeMemoryReleaseFn = func() {}
 
 	// ProgressFunc is nil — should not panic.
@@ -547,8 +537,6 @@ func TestStaticService_ProgressFunc_Nil_NoError(t *testing.T) {
 	require.NotEmpty(t, results)
 }
 
-// FRD: specs/frds/FRD-20260312-static-plot-multipage.md.
-
 func TestStaticService_FormatPlotPages_ProducesHTML(t *testing.T) {
 	t.Parallel()
 
@@ -559,7 +547,7 @@ func TestStaticService_FormatPlotPages_ProducesHTML(t *testing.T) {
 	dir := t.TempDir()
 	writeTestGoFile(t, dir, "main.go")
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.NativeMemoryReleaseFn = func() {}
 	svc.AggregationMode = analyze.AggregationModeFull
 
@@ -591,7 +579,7 @@ func TestStaticService_FormatPlotPages_SkipsUnregisteredAnalyzers(t *testing.T)
 	dir := t.TempDir()
 	writeTestGoFile(t, dir, "main.go")
 
-	svc := analyze.NewStaticService(testStaticAnalyzers())
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
 	svc.NativeMemoryReleaseFn = func() {}
 
 	results := map[string]analyze.Report{
@@ -627,3 +615,101 @@ func writeTestGoFile(t *testing.T, dir, name string) {
 
 	require.NoError(t, os.WriteFile(path, content, 0o600))
 }
+
+func TestStaticService_PerFile_FieldExists(t *testing.T) {
+	t.Parallel()
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.PerFile = true
+
+	assert.True(t, svc.PerFile)
+}
+
+func TestStaticService_PerFile_AnalyzeFolderRetainsPerFileResults(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	writeTestGoFile(t, dir, "a.go")
+	writeTestGoFile(t, dir, "b.go")
+	writeTestGoFile(t, dir, "c.go")
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.NativeMemoryReleaseFn = func() {}
+	svc.PerFile = true
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	_ = results
+
+	perFile := svc.PerFileResults()
+	require.NotNil(t, perFile, "per-file results must be present when PerFile=true")
+
+	// Each analyzer should have 3 per-file entries.
+	for analyzerName, fileResults := range perFile {
+		assert.Len(t, fileResults, 3,
+			"analyzer %s must have 3 per-file entries", analyzerName)
+	}
+}
+
+func TestStaticService_PerFile_FormatJSONIncludesFiles(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	writeTestGoFile(t, dir, "a.go")
+	writeTestGoFile(t, dir, "b.go")
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.NativeMemoryReleaseFn = func() {}
+	svc.Renderer = &renderer.DefaultStaticRenderer{}
+	svc.PerFile = true
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	var buf bytes.Buffer
+	require.NoError(t, svc.FormatJSON(results, &buf))
+
+	jsonStr := buf.String()
+	assert.Contains(t, jsonStr, `"files"`, "JSON must include files array")
+	assert.Contains(t, jsonStr, `"file_path"`, "files entries must have file_path")
+}
+
+func TestStaticService_PerFile_DisabledReturnsNil(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	writeTestGoFile(t, dir, "a.go")
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.NativeMemoryReleaseFn = func() {}
+	// PerFile is false (default).
+
+	_, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	assert.Nil(t, svc.PerFileResults(), "per-file results must be nil when PerFile is false")
+}
+
+func TestStaticService_FormatPlotPages_EmitsReportJSON(t *testing.T) {
+	t.Parallel()
+
+	svc := analyze.NewStaticService(testStaticAnalyzers(), nil)
+	svc.NativeMemoryReleaseFn = func() {}
+
+	results := map[string]analyze.Report{
+		"complexity": {"total_functions": 1},
+	}
+
+	outputDir := filepath.Join(t.TempDir(), "plot-output")
+
+	require.NoError(t, svc.FormatPlotPages([]string{"complexity"}, results, outputDir))
+
+	reportPath := filepath.Join(outputDir, "report.json")
+	data, err := os.ReadFile(reportPath)
+	require.NoError(t, err, "report.json must exist after FormatPlotPages")
+
+	var parsed map[string]any
+	require.NoError(t, json.Unmarshal(data, &parsed), "report.json must be valid JSON")
+	assert.Contains(t, parsed, "complexity", "report.json must contain analyzer results")
+}
diff --git a/internal/analyzers/analyze/tick_bounds.go b/internal/analyzers/analyze/tick_bounds.go
new file mode 100644
index 0000000..56ba040
--- /dev/null
+++ b/internal/analyzers/analyze/tick_bounds.go
@@ -0,0 +1,46 @@
+package analyze
+
+import "time"
+
+// TickBounds holds the time boundaries of a single tick.
+type TickBounds struct {
+	StartTime time.Time
+	EndTime   time.Time
+}
+
+// FormatStartTime returns StartTime as a UTC RFC 3339 string, or empty if zero.
+func (b TickBounds) FormatStartTime() string {
+	if b.StartTime.IsZero() {
+		return ""
+	}
+
+	return b.StartTime.UTC().Format(time.RFC3339)
+}
+
+// FormatEndTime returns EndTime as an RFC 3339 string, or empty if zero.
+func (b TickBounds) FormatEndTime() string {
+	if b.EndTime.IsZero() {
+		return ""
+	}
+
+	return b.EndTime.UTC().Format(time.RFC3339)
+}
+
+// BuildTickBounds extracts tick boundaries from a slice of TICKs.
+// Returns a map from tick index to its time bounds, or nil when ticks is empty.
+func BuildTickBounds(ticks []TICK) map[int]TickBounds {
+	if len(ticks) == 0 {
+		return nil
+	}
+
+	result := make(map[int]TickBounds, len(ticks))
+
+	for _, tick := range ticks {
+		result[tick.Tick] = TickBounds{
+			StartTime: tick.StartTime,
+			EndTime:   tick.EndTime,
+		}
+	}
+
+	return result
+}
diff --git a/internal/analyzers/analyze/tick_bounds_test.go b/internal/analyzers/analyze/tick_bounds_test.go
new file mode 100644
index 0000000..7b3a6de
--- /dev/null
+++ b/internal/analyzers/analyze/tick_bounds_test.go
@@ -0,0 +1,88 @@
+package analyze_test
+
+import (
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+var (
+	testTime1 = time.Date(2024, 1, 15, 10, 0, 0, 0, time.UTC)
+	testTime2 = time.Date(2024, 1, 16, 12, 0, 0, 0, time.UTC)
+	testTime3 = time.Date(2024, 1, 17, 14, 0, 0, 0, time.UTC)
+)
+
+func TestBuildTickBounds_Empty(t *testing.T) {
+	t.Parallel()
+
+	result := analyze.BuildTickBounds(nil)
+
+	assert.Empty(t, result)
+}
+
+func TestBuildTickBounds_SingleTick(t *testing.T) {
+	t.Parallel()
+
+	ticks := []analyze.TICK{
+		{Tick: 0, StartTime: testTime1, EndTime: testTime2},
+	}
+
+	result := analyze.BuildTickBounds(ticks)
+
+	require.Len(t, result, 1)
+	assert.Equal(t, testTime1, result[0].StartTime)
+	assert.Equal(t, testTime2, result[0].EndTime)
+}
+
+func TestBuildTickBounds_MultipleTicks(t *testing.T) {
+	t.Parallel()
+
+	ticks := []analyze.TICK{
+		{Tick: 0, StartTime: testTime1, EndTime: testTime2},
+		{Tick: 1, StartTime: testTime2, EndTime: testTime3},
+	}
+
+	result := analyze.BuildTickBounds(ticks)
+
+	require.Len(t, result, 2)
+	assert.Equal(t, testTime1, result[0].StartTime)
+	assert.Equal(t, testTime2, result[1].StartTime)
+	assert.Equal(t, testTime3, result[1].EndTime)
+}
+
+func TestBuildTickBounds_ZeroTimesKept(t *testing.T) {
+	t.Parallel()
+
+	ticks := []analyze.TICK{
+		{Tick: 0},
+		{Tick: 1, StartTime: testTime1, EndTime: testTime2},
+	}
+
+	result := analyze.BuildTickBounds(ticks)
+
+	require.Len(t, result, 2)
+	assert.True(t, result[0].StartTime.IsZero())
+	assert.Equal(t, testTime1, result[1].StartTime)
+}
+
+func TestTickBoundsFormatStartTime(t *testing.T) {
+	t.Parallel()
+
+	bounds := analyze.TickBounds{StartTime: testTime1, EndTime: testTime2}
+
+	assert.Equal(t, "2024-01-15T10:00:00Z", bounds.FormatStartTime())
+	assert.Equal(t, "2024-01-16T12:00:00Z", bounds.FormatEndTime())
+}
+
+func TestTickBoundsFormatStartTime_Zero(t *testing.T) {
+	t.Parallel()
+
+	bounds := analyze.TickBounds{}
+
+	assert.Empty(t, bounds.FormatStartTime())
+	assert.Empty(t, bounds.FormatEndTime())
+}
diff --git a/internal/analyzers/analyze/typed_collection.go b/internal/analyzers/analyze/typed_collection.go
index 8f61a55..3627bef 100644
--- a/internal/analyzers/analyze/typed_collection.go
+++ b/internal/analyzers/analyze/typed_collection.go
@@ -1,7 +1,5 @@
 package analyze
 
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
-
 // ItemConverter converts a typed items slice and source file path into []map[string]any.
 // The sourceFile parameter is the path stamped by StampSourceFile; when non-empty, the
 // converter should include it as "_source_file" in each output map.
@@ -13,6 +11,8 @@ type ItemConverter func(items any, sourceFile string) []map[string]any
 type TypedCollection struct {
 	Items      any           // concrete typed slice (e.g., []FunctionMetrics).
 	SourceFile string        // stamped by StampSourceFile.
+	Language   string        // stamped by StampLanguage.
+	Directory  string        // stamped by StampSourceFile (filepath.Dir of relative path).
 	ToMaps     ItemConverter // deferred converter.
 }
 
@@ -27,3 +27,9 @@ func (tc TypedCollection) MapSlice() []map[string]any {
 
 // SourceFileKey is the report key used to stamp the originating source file.
 const SourceFileKey = "_source_file"
+
+// LanguageKey is the report key used to stamp the detected programming language.
+const LanguageKey = "_language"
+
+// DirectoryKey is the report key used to stamp the parent directory of the source file.
+const DirectoryKey = "_directory"
diff --git a/internal/analyzers/anomaly/analyzer.go b/internal/analyzers/anomaly/analyzer.go
index 96d97af..dd0fc5f 100644
--- a/internal/analyzers/anomaly/analyzer.go
+++ b/internal/analyzers/anomaly/analyzer.go
@@ -503,6 +503,7 @@ func ticksToReport(
 		"anomalies":       anomalies,
 		"threshold":       threshold,
 		"window_size":     window,
+		"tick_bounds":     analyze.BuildTickBounds(ticks),
 	}
 }
 
diff --git a/internal/analyzers/anomaly/enrich_store_test.go b/internal/analyzers/anomaly/enrich_store_test.go
index 88e2cc2..233310c 100644
--- a/internal/analyzers/anomaly/enrich_store_test.go
+++ b/internal/analyzers/anomaly/enrich_store_test.go
@@ -1,7 +1,5 @@
 package anomaly
 
-// FRD: specs/frds/FRD-20260301-anomaly-enrich-from-store.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/anomaly/enrich_test.go b/internal/analyzers/anomaly/enrich_test.go
index 360be7c..3bc45b6 100644
--- a/internal/analyzers/anomaly/enrich_test.go
+++ b/internal/analyzers/anomaly/enrich_test.go
@@ -1,7 +1,5 @@
 package anomaly
 
-// FRD: specs/frds/FRD-20260301-anomaly-enrich-from-store.md.
-
 import (
 	"testing"
 
diff --git a/internal/analyzers/anomaly/metrics.go b/internal/analyzers/anomaly/metrics.go
index 0ca2591..76096bb 100644
--- a/internal/analyzers/anomaly/metrics.go
+++ b/internal/analyzers/anomaly/metrics.go
@@ -71,12 +71,14 @@ type AggregateData struct {
 
 // TimeSeriesEntry holds per-tick data for the time series output.
 type TimeSeriesEntry struct {
-	Tick              int        `json:"tick"               yaml:"tick"`
-	Metrics           RawMetrics `json:"metrics"            yaml:"metrics"`
-	IsAnomaly         bool       `json:"is_anomaly"         yaml:"is_anomaly"`
-	ChurnZScore       float64    `json:"churn_z_score"      yaml:"churn_z_score"`
-	LanguageDiversity int        `json:"language_diversity" yaml:"language_diversity"`
-	AuthorCount       int        `json:"author_count"       yaml:"author_count"`
+	Tick              int        `json:"tick"                 yaml:"tick"`
+	StartTime         string     `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime           string     `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	Metrics           RawMetrics `json:"metrics"              yaml:"metrics"`
+	IsAnomaly         bool       `json:"is_anomaly"           yaml:"is_anomaly"`
+	ChurnZScore       float64    `json:"churn_z_score"        yaml:"churn_z_score"`
+	LanguageDiversity int        `json:"language_diversity"   yaml:"language_diversity"`
+	AuthorCount       int        `json:"author_count"         yaml:"author_count"`
 }
 
 // --- External Anomaly Types ---.
@@ -184,7 +186,7 @@ func computeTimeSeries(input *ReportData) []TimeSeriesEntry {
 			churnZ = churnScores[i]
 		}
 
-		entries[i] = TimeSeriesEntry{
+		entry := TimeSeriesEntry{
 			Tick: tick,
 			Metrics: RawMetrics{
 				FilesChanged:      tm.FilesChanged,
@@ -199,6 +201,13 @@ func computeTimeSeries(input *ReportData) []TimeSeriesEntry {
 			LanguageDiversity: len(tm.Languages),
 			AuthorCount:       len(tm.AuthorIDs),
 		}
+
+		if bounds, hasBounds := input.TickBounds[tick]; hasBounds {
+			entry.StartTime = bounds.FormatStartTime()
+			entry.EndTime = bounds.FormatEndTime()
+		}
+
+		entries[i] = entry
 	}
 
 	return entries
@@ -210,6 +219,7 @@ func computeTimeSeries(input *ReportData) []TimeSeriesEntry {
 type ReportData struct {
 	Anomalies         []Record
 	TickMetrics       map[int]*TickMetrics
+	TickBounds        map[int]analyze.TickBounds
 	Threshold         float32
 	WindowSize        int
 	ExternalAnomalies []ExternalAnomaly
@@ -307,6 +317,10 @@ func ParseReportData(report analyze.Report) (*ReportData, error) {
 		data.ExternalSummaries = v
 	}
 
+	if v, ok := report["tick_bounds"].(map[int]analyze.TickBounds); ok {
+		data.TickBounds = v
+	}
+
 	return data, nil
 }
 
diff --git a/internal/analyzers/anomaly/store_writer_test.go b/internal/analyzers/anomaly/store_writer_test.go
index 972d61a..6f355a7 100644
--- a/internal/analyzers/anomaly/store_writer_test.go
+++ b/internal/analyzers/anomaly/store_writer_test.go
@@ -1,7 +1,5 @@
 package anomaly
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/burndown/history.go b/internal/analyzers/burndown/history.go
index 520c2eb..4bb8872 100644
--- a/internal/analyzers/burndown/history.go
+++ b/internal/analyzers/burndown/history.go
@@ -98,6 +98,10 @@ type HistoryAnalyzer struct {
 	TrackFiles           bool
 	HibernationToDisk    bool
 	lastCommitTime       time.Time
+
+	// mismatch tracks src-mismatch reset events (rate-limited logging,
+	// per-chunk and cumulative counters). Surfaced via MismatchStats.
+	mismatch mismatchTracker
 }
 
 const (
@@ -166,6 +170,12 @@ func NewHistoryAnalyzer() *HistoryAnalyzer {
 	return ha
 }
 
+// MismatchStats returns cumulative src-mismatch counters for this analyzer.
+// See [MismatchStats] for the operational meaning of these numbers.
+func (b *HistoryAnalyzer) MismatchStats() MismatchStats {
+	return b.mismatch.snapshot()
+}
+
 // ListConfigurationOptions returns the configuration options for the analyzer.
 func (b *HistoryAnalyzer) ListConfigurationOptions() []pipeline.ConfigurationOption {
 	return []pipeline.ConfigurationOption{
diff --git a/internal/analyzers/burndown/history_changes.go b/internal/analyzers/burndown/history_changes.go
index a43fdf2..083ff90 100644
--- a/internal/analyzers/burndown/history_changes.go
+++ b/internal/analyzers/burndown/history_changes.go
@@ -133,7 +133,7 @@ func (b *HistoryAnalyzer) countDeletionLines(
 
 // forceRemoveFile handles treap/blob length mismatch by force-deleting the file tracking.
 func (b *HistoryAnalyzer) forceRemoveFile(shard *Shard, id PathID, name string, file *burndown.File) {
-	log.Printf("burndown: src mismatch for deletion %s (tracked=%d), force-removing", name, file.Len())
+	b.mismatch.recordForceRemove(name, file.Len())
 	file.Delete()
 
 	shard.filesByID[id] = nil
@@ -361,8 +361,7 @@ func (b *HistoryAnalyzer) resetAndReinsert(
 	shard *Shard, change *gitlib.Change, id PathID, author int,
 	cache map[gitlib.Hash]*pkgplumbing.CachedBlob,
 ) error {
-	log.Printf("burndown: src mismatch for %s (tracked=%d, diff_old=...), resetting",
-		change.To.Name, shard.filesByID[id].Len())
+	b.mismatch.recordReset(change.To.Name, shard.filesByID[id].Len())
 
 	shard.filesByID[id] = nil
 	b.removeActiveID(shard, id)
diff --git a/internal/analyzers/burndown/history_lifecycle.go b/internal/analyzers/burndown/history_lifecycle.go
index 1aa3ca5..acaf18f 100644
--- a/internal/analyzers/burndown/history_lifecycle.go
+++ b/internal/analyzers/burndown/history_lifecycle.go
@@ -2,6 +2,7 @@ package burndown
 
 import (
 	"fmt"
+	"log"
 	"os"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
@@ -118,6 +119,8 @@ func (b *HistoryAnalyzer) mergeTicks(other *HistoryAnalyzer) {
 
 // Hibernate releases resources between processing phases.
 func (b *HistoryAnalyzer) Hibernate() error {
+	b.logChunkMismatchSummary()
+
 	err := b.ensureSpillDir()
 	if err != nil {
 		return fmt.Errorf("burndown spill dir: %w", err)
@@ -137,6 +140,24 @@ func (b *HistoryAnalyzer) Hibernate() error {
 	return nil
 }
 
+// logChunkMismatchSummary emits a single line summarizing src-mismatch
+// events (resets and force-removes) recorded since the last chunk boundary,
+// then re-baselines the counter for the next chunk. Silent when none happened.
+func (b *HistoryAnalyzer) logChunkMismatchSummary() {
+	delta := b.mismatch.chunkDelta()
+	if delta == 0 {
+		b.mismatch.resetChunkBaseline()
+
+		return
+	}
+
+	stats := b.mismatch.snapshot()
+	log.Printf("burndown: chunk src-mismatch summary chunk_resets=%d cumulative_resets=%d cumulative_force_removes=%d",
+		delta, stats.Resets, stats.ForceRemoves)
+
+	b.mismatch.resetChunkBaseline()
+}
+
 // hibernateShard shrinks treap pools, spills to disk, and resets tracking maps.
 func (b *HistoryAnalyzer) hibernateShard(shard *Shard, idx int) error {
 	shard.mu.Lock()
diff --git a/internal/analyzers/burndown/mismatch_tracker.go b/internal/analyzers/burndown/mismatch_tracker.go
new file mode 100644
index 0000000..82385ee
--- /dev/null
+++ b/internal/analyzers/burndown/mismatch_tracker.go
@@ -0,0 +1,124 @@
+package burndown
+
+import (
+	"log"
+	"sync/atomic"
+	"time"
+)
+
+// mismatchLogIntervalNanos throttles the per-event src-mismatch log line.
+// Bursts are common (one large commit can reset thousands of file states
+// in a single tick after the blob-pipeline cap silently skips a monster
+// commit upstream); without throttling, the log becomes the long pole.
+// 1 second across all shards keeps the operator-facing signal while
+// dropping the cost from O(mismatches) log writes to O(seconds).
+const mismatchLogIntervalNanos = int64(time.Second)
+
+// mismatchTracker counts src-mismatch reset events on the burndown analyzer
+// and rate-limits the per-event log line. All fields are accessed atomically
+// so the tracker is safe to call from per-shard goroutines.
+//
+// The counter splits resets (file present, line count diverged) from
+// force-removes (file deleted while line count diverged) so consumers can
+// tell apart the two recovery paths handled by history_changes.go.
+type mismatchTracker struct {
+	resets        atomic.Int64
+	forceRemoves  atomic.Int64
+	dropped       atomic.Int64 // events suppressed since the last emitted log line.
+	lastLogNanos  atomic.Int64 // wall-clock UnixNano timestamp of last emitted log line.
+	chunkBaseline atomic.Int64 // resets+forceRemoves at start of current chunk.
+}
+
+// recordReset bumps the reset counter and emits a rate-limited log line.
+// name is the file path; tracked is the analyzer's stale line count for it.
+func (t *mismatchTracker) recordReset(name string, tracked int) {
+	t.resets.Add(1)
+	t.maybeLog(name, tracked, "resetting")
+}
+
+// recordForceRemove bumps the force-remove counter and emits a rate-limited
+// log line. Mirrors recordReset for the deletion path so the two recovery
+// modes show up as separate counters.
+func (t *mismatchTracker) recordForceRemove(name string, tracked int) {
+	t.forceRemoves.Add(1)
+	t.maybeLog("deletion "+name, tracked, "force-removing")
+}
+
+// maybeLog emits a log line at most once per mismatchLogIntervalNanos, atomic
+// across shards. Suppressed events are counted in `dropped` and surfaced as a
+// `dropped=N since last` suffix on the next emitted line.
+func (t *mismatchTracker) maybeLog(name string, tracked int, kind string) {
+	now := time.Now().UnixNano()
+	last := t.lastLogNanos.Load()
+
+	if now-last < mismatchLogIntervalNanos {
+		t.dropped.Add(1)
+
+		return
+	}
+
+	if !t.lastLogNanos.CompareAndSwap(last, now) {
+		// Another shard claimed this slot — count as dropped to keep the
+		// total consistent with one-log-per-interval semantics.
+		t.dropped.Add(1)
+
+		return
+	}
+
+	dropped := t.dropped.Swap(0)
+	if dropped == 0 {
+		log.Printf("burndown: src mismatch for %s (tracked=%d, diff_old=...), %s",
+			name, tracked, kind)
+
+		return
+	}
+
+	log.Printf("burndown: src mismatch for %s (tracked=%d, diff_old=...), %s [dropped=%d since last]",
+		name, tracked, kind, dropped)
+}
+
+// snapshot returns the running counts. Used by Hibernate() for chunk summaries
+// and exposed to external observers via HistoryAnalyzer.MismatchStats.
+func (t *mismatchTracker) snapshot() MismatchStats {
+	return MismatchStats{
+		Resets:       t.resets.Load(),
+		ForceRemoves: t.forceRemoves.Load(),
+	}
+}
+
+// resetChunkBaseline marks the cumulative count at the start of a chunk so
+// per-chunk deltas can be reported on the next Hibernate.
+func (t *mismatchTracker) resetChunkBaseline() {
+	t.chunkBaseline.Store(t.resets.Load() + t.forceRemoves.Load())
+}
+
+// chunkDelta returns the number of mismatch events recorded since the last
+// resetChunkBaseline call.
+func (t *mismatchTracker) chunkDelta() int64 {
+	return (t.resets.Load() + t.forceRemoves.Load()) - t.chunkBaseline.Load()
+}
+
+// MismatchStats reports cumulative src-mismatch reset events on the burndown
+// analyzer. Consumers (tests, observability) read these via
+// HistoryAnalyzer.MismatchStats().
+//
+// Resets count file modifications where the analyzer's tracked line count
+// did not match the diff's OldLinesOfCode — typically after the blob
+// pipeline silently skipped a "monster" commit (see ErrCommitTooLarge), so
+// the analyzer's state lags reality by one or more commits' worth of edits.
+// ForceRemoves count the same divergence on the deletion path.
+//
+// A non-zero value implies burndown's per-file survival history is stale
+// for the affected files at the reset point — the file is treated as a
+// fresh insertion thereafter. Surface this to operators when interpreting
+// per-file results on repos with large mass-update commits (vendor moves,
+// generated-code regenerations, Pods updates).
+type MismatchStats struct {
+	Resets       int64
+	ForceRemoves int64
+}
+
+// Total returns the sum of resets and force-removes.
+func (s MismatchStats) Total() int64 {
+	return s.Resets + s.ForceRemoves
+}
diff --git a/internal/analyzers/burndown/mismatch_tracker_test.go b/internal/analyzers/burndown/mismatch_tracker_test.go
new file mode 100644
index 0000000..b3282ba
--- /dev/null
+++ b/internal/analyzers/burndown/mismatch_tracker_test.go
@@ -0,0 +1,164 @@
+package burndown
+
+import (
+	"sync"
+	"testing"
+)
+
+func TestMismatchTracker_RecordReset_BumpsResetsCounter(t *testing.T) {
+	t.Parallel()
+
+	var tr mismatchTracker
+
+	tr.recordReset("foo.go", 12)
+	tr.recordReset("bar.go", 34)
+
+	stats := tr.snapshot()
+	if stats.Resets != 2 {
+		t.Errorf("Resets = %d, want 2", stats.Resets)
+	}
+
+	if stats.ForceRemoves != 0 {
+		t.Errorf("ForceRemoves = %d, want 0", stats.ForceRemoves)
+	}
+}
+
+func TestMismatchTracker_RecordForceRemove_BumpsForceRemovesCounter(t *testing.T) {
+	t.Parallel()
+
+	var tr mismatchTracker
+
+	tr.recordForceRemove("foo.go", 99)
+
+	stats := tr.snapshot()
+	if stats.ForceRemoves != 1 {
+		t.Errorf("ForceRemoves = %d, want 1", stats.ForceRemoves)
+	}
+
+	if stats.Resets != 0 {
+		t.Errorf("Resets = %d, want 0", stats.Resets)
+	}
+}
+
+func TestMismatchTracker_RateLimit_DropsBurstWithinInterval(t *testing.T) {
+	t.Parallel()
+
+	var tr mismatchTracker
+
+	// Fire a burst of 1000 resets back-to-back. Only the first should win
+	// the log slot; the rest must be counted as dropped.
+	for range 1000 {
+		tr.recordReset("foo.go", 1)
+	}
+
+	if got := tr.dropped.Load(); got != 999 {
+		t.Errorf("dropped = %d, want 999 (1000 events, 1 logged, 999 suppressed)", got)
+	}
+
+	if got := tr.snapshot().Resets; got != 1000 {
+		t.Errorf("Resets = %d, want 1000 (counter must record every event regardless of log throttle)", got)
+	}
+}
+
+func TestMismatchTracker_RateLimit_AllowsAfterInterval(t *testing.T) {
+	t.Parallel()
+
+	var tr mismatchTracker
+
+	// First call wins the slot.
+	tr.recordReset("foo.go", 1)
+	first := tr.lastLogNanos.Load()
+
+	// Force the next call into a fresh interval by rewinding the timestamp.
+	tr.lastLogNanos.Store(first - mismatchLogIntervalNanos - 1)
+
+	// Seed dropped with a non-zero value so we can verify the second call clears it on emit.
+	tr.dropped.Store(5)
+
+	tr.recordReset("bar.go", 2)
+
+	if got := tr.dropped.Load(); got != 0 {
+		t.Errorf("dropped after fresh interval = %d, want 0 (Swap should clear it on emit)", got)
+	}
+
+	if tr.lastLogNanos.Load() == first {
+		t.Errorf("lastLogNanos did not advance — second call did not claim the slot")
+	}
+}
+
+func TestMismatchTracker_ChunkDelta_TracksSinceBaseline(t *testing.T) {
+	t.Parallel()
+
+	var tr mismatchTracker
+
+	tr.recordReset("a", 1)
+	tr.recordReset("b", 1)
+	tr.resetChunkBaseline()
+
+	if got := tr.chunkDelta(); got != 0 {
+		t.Errorf("chunkDelta after baseline = %d, want 0", got)
+	}
+
+	tr.recordReset("c", 1)
+	tr.recordForceRemove("d", 1)
+
+	if got := tr.chunkDelta(); got != 2 {
+		t.Errorf("chunkDelta after 2 events = %d, want 2", got)
+	}
+
+	// Cumulative counters keep climbing.
+	if got := tr.snapshot().Total(); got != 4 {
+		t.Errorf("Total = %d, want 4 (cumulative across baseline reset)", got)
+	}
+}
+
+func TestMismatchTracker_ConcurrentRecord_NoLostUpdates(t *testing.T) {
+	t.Parallel()
+
+	var (
+		tr        mismatchTracker
+		wg        sync.WaitGroup
+		perWorker = int64(500)
+		workers   = 8
+	)
+
+	wg.Add(workers)
+
+	for range workers {
+		go func() {
+			defer wg.Done()
+
+			for range int(perWorker) {
+				tr.recordReset("x", 1)
+			}
+		}()
+	}
+
+	wg.Wait()
+
+	want := perWorker * int64(workers)
+	if got := tr.snapshot().Resets; got != want {
+		t.Errorf("Resets = %d, want %d (concurrent atomic updates must not lose any)", got, want)
+	}
+
+	// At most one log per interval; bound the number that could have won
+	// the slot during this short test (a few, definitely not all).
+	logged := tr.snapshot().Resets - tr.dropped.Load()
+	if logged < 1 {
+		t.Errorf("logged events = %d, want at least 1", logged)
+	}
+
+	if logged > want {
+		t.Errorf("logged events = %d > total = %d, dropped count is broken", logged, want)
+	}
+}
+
+func TestMismatchStats_Total_SumsBothCounters(t *testing.T) {
+	t.Parallel()
+
+	s := MismatchStats{Resets: 7, ForceRemoves: 3}
+
+	if got := s.Total(); got != 10 {
+		t.Errorf("Total = %d, want 10", got)
+	}
+}
diff --git a/internal/analyzers/burndown/store_writer_test.go b/internal/analyzers/burndown/store_writer_test.go
index a261725..8294877 100644
--- a/internal/analyzers/burndown/store_writer_test.go
+++ b/internal/analyzers/burndown/store_writer_test.go
@@ -1,7 +1,5 @@
 package burndown
 
-// FRD: specs/frds/FRD-20260301-burndown-filehistory-store-writer.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/clones/aggregator.go b/internal/analyzers/clones/aggregator.go
index 0a95680..4f74d4b 100644
--- a/internal/analyzers/clones/aggregator.go
+++ b/internal/analyzers/clones/aggregator.go
@@ -99,14 +99,14 @@ func (a *Aggregator) GetResult() analyze.Report {
 		return buildEmptyReport(msgNoFunctions)
 	}
 
-	pairs, totalCount := a.detectGlobalClones()
+	result := a.detectGlobalClones()
 
-	cloneRatio := computeCloneRatio(totalCount, a.totalFunctions)
-	message := cloneMessage(totalCount)
+	cloneRatio := computeCloneRatio(len(result.clonedFunc), a.totalFunctions)
+	message := cloneMessage(result.totalCount)
 
-	pairsForReport := make([]map[string]any, 0, len(pairs))
+	pairsForReport := make([]map[string]any, 0, len(result.pairs))
 
-	for _, p := range pairs {
+	for _, p := range result.pairs {
 		pairsForReport = append(pairsForReport, map[string]any{
 			"func_a":     p.FuncA,
 			"func_b":     p.FuncB,
@@ -116,25 +116,25 @@ func (a *Aggregator) GetResult() analyze.Report {
 	}
 
 	return analyze.Report{
-		keyAnalyzerName:    analyzerName,
-		keyTotalFunctions:  a.totalFunctions,
-		keyTotalClonePairs: totalCount,
-		keyCloneRatio:      cloneRatio,
-		keyClonePairs:      pairsForReport,
-		keyMessage:         message,
+		keyAnalyzerName:          analyzerName,
+		keyTotalFunctions:        a.totalFunctions,
+		keyTotalClonePairs:       result.totalCount,
+		keyCloneRatio:            cloneRatio,
+		keyClonePairs:            pairsForReport,
+		keyCloneTypeDistribution: cloneTypeDistMap(result.typeDistribution),
+		keyMessage:               message,
 	}
 }
 
 // detectGlobalClones builds a single LSH index from all entries and finds clone pairs.
-// Returns the (possibly capped) pairs slice and the exact total count of all pairs found.
-func (a *Aggregator) detectGlobalClones() (pairs []ClonePair, totalCount int) {
+func (a *Aggregator) detectGlobalClones() clonePairResult {
 	if len(a.entries) == 0 {
-		return nil, 0
+		return clonePairResult{}
 	}
 
 	idx, err := lsh.New(a.NumBands, a.NumRows)
 	if err != nil {
-		return nil, 0
+		return clonePairResult{}
 	}
 
 	for _, entry := range a.entries {
diff --git a/internal/analyzers/clones/analyzer.go b/internal/analyzers/clones/analyzer.go
index 3cf10f3..181b0ca 100644
--- a/internal/analyzers/clones/analyzer.go
+++ b/internal/analyzers/clones/analyzer.go
@@ -4,6 +4,7 @@ import (
 	"encoding/json"
 	"fmt"
 	"io"
+	"strings"
 
 	"gopkg.in/yaml.v3"
 
@@ -29,6 +30,13 @@ const (
 	// numRows is the number of rows per LSH band.
 	numRows = 8
 
+	// minFunctionNodes is the minimum number of AST nodes a function must have
+	// to be included in clone detection. Functions below this threshold are
+	// trivial (getters, setters, return-nil stubs) and produce false positives
+	// because their minimal AST structure hashes identically regardless of purpose.
+	// Empirical: getters ≈ 13-15 nodes, setters ≈ 19, real logic ≥ 25.
+	minFunctionNodes = 20
+
 	// analyzerName is the registered name of the clone detection analyzer.
 	analyzerName = "clones"
 
@@ -315,9 +323,9 @@ func (a *Analyzer) detectClones(functions []*node.Node) []ClonePair {
 	}
 
 	// Per-file detection: no cap (single-file scope, bounded by function count).
-	pairs, _ := findClonePairs(entries, idx, 0, a.cfgSimilarityType3)
+	result := findClonePairs(entries, idx, 0, a.cfgSimilarityType3)
 
-	return pairs
+	return result.pairs
 }
 
 // buildSignatures computes MinHash signatures for all functions.
@@ -325,6 +333,10 @@ func (a *Analyzer) buildSignatures(functions []*node.Node) []funcEntry {
 	entries := make([]funcEntry, 0, len(functions))
 
 	for _, fn := range functions {
+		if countNodes(fn) < minFunctionNodes {
+			continue
+		}
+
 		shingles := a.shingler.ExtractShingles(fn)
 		if len(shingles) == 0 {
 			continue
@@ -350,22 +362,86 @@ func (a *Analyzer) buildSignatures(functions []*node.Node) []funcEntry {
 	return entries
 }
 
-// extractFuncName extracts the function name from a node.
+// extractFuncName extracts a qualified function name from a node.
+// For methods, qualifies with the receiver type (e.g., "Foo.DoWork") to avoid
+// collisions in the LSH index when different types share the same method name.
 func extractFuncName(fn *node.Node) string {
-	if name, ok := common.ExtractEntityName(fn); ok && name != "" {
-		return name
+	name, ok := common.ExtractEntityName(fn)
+	if !ok || name == "" {
+		if fn.Token != "" {
+			name = fn.Token
+		} else {
+			name = string(fn.Type)
+		}
+	}
+
+	if fn.Type == node.UASTMethod {
+		if recv := extractReceiverType(fn); recv != "" {
+			return recv + "." + name
+		}
+	}
+
+	return name
+}
+
+// extractReceiverType extracts the receiver type name from a Method node.
+// The UAST represents the receiver as the first Parameter child with a token
+// like "(f *Foo)" or "(f Foo)".
+func extractReceiverType(fn *node.Node) string {
+	for _, child := range fn.Children {
+		if !child.HasAnyRole(node.RoleParameter) {
+			continue
+		}
+
+		// The receiver parameter token contains the full "(name *Type)" text.
+		tok := child.Token
+		if tok == "" {
+			continue
+		}
+
+		// Extract the type name: strip the surrounding parens here; the variable
+		// name and pointer star are dropped below.
+		tok = strings.TrimPrefix(tok, "(")
+		tok = strings.TrimSuffix(tok, ")")
+		tok = strings.TrimSpace(tok)
+
+		// Split "f *Foo" into parts, take the last one (the type).
+		parts := strings.Fields(tok)
+		// Receiver has at least two parts: variable name and type.
+		const minReceiverParts = 2
+		if len(parts) < minReceiverParts {
+			continue
+		}
+
+		typeName := parts[len(parts)-1]
+		typeName = strings.TrimPrefix(typeName, "*")
+
+		if typeName != "" {
+			return typeName
+		}
+	}
+
+	return ""
+}
+
+// countNodes returns the total number of nodes in a subtree.
+func countNodes(n *node.Node) int {
+	if n == nil {
+		return 0
 	}
 
-	if fn.Token != "" {
-		return fn.Token
+	count := 1
+
+	for _, child := range n.Children {
+		count += countNodes(child)
 	}
 
-	return string(fn.Type)
+	return count
 }
 
 // buildReport constructs the analysis report.
 func (a *Analyzer) buildReport(totalFunctions int, pairs []ClonePair) analyze.Report {
-	cloneRatio := computeCloneRatio(len(pairs), totalFunctions)
+	cloneRatio := computeCloneRatio(countDistinctFuncs(pairs), totalFunctions)
 	message := cloneMessage(len(pairs))
 
 	pairsForReport := make([]map[string]any, 0, len(pairs))
@@ -380,12 +456,13 @@ func (a *Analyzer) buildReport(totalFunctions int, pairs []ClonePair) analyze.Re
 	}
 
 	return analyze.Report{
-		keyAnalyzerName:    analyzerName,
-		keyTotalFunctions:  totalFunctions,
-		keyTotalClonePairs: len(pairs),
-		keyCloneRatio:      cloneRatio,
-		keyClonePairs:      pairsForReport,
-		keyMessage:         message,
+		keyAnalyzerName:          analyzerName,
+		keyTotalFunctions:        totalFunctions,
+		keyTotalClonePairs:       len(pairs),
+		keyCloneRatio:            cloneRatio,
+		keyClonePairs:            pairsForReport,
+		keyCloneTypeDistribution: cloneTypeDistMap(categorizeClonePairs(pairs)),
+		keyMessage:               message,
 	}
 }
 
@@ -400,13 +477,26 @@ func buildEmptyReport(message string) analyze.Report {
 	})
 }
 
-// computeCloneRatio calculates the ratio of clone pairs to total functions.
-func computeCloneRatio(pairCount, totalFunctions int) float64 {
-	if totalFunctions == 0 {
+// countDistinctFuncs returns the number of unique function names across all pairs.
+func countDistinctFuncs(pairs []ClonePair) int {
+	unique := make(map[string]struct{}, len(pairs))
+
+	for idx := range pairs {
+		unique[pairs[idx].FuncA] = struct{}{}
+		unique[pairs[idx].FuncB] = struct{}{}
+	}
+
+	return len(unique)
+}
+
+// computeCloneRatio calculates the fraction of functions involved in at least one clone pair.
+// Returns a value in [0, 1]: 0 means no duplication, 1 means every function has a clone.
+func computeCloneRatio(clonedFuncs, totalFunctions int) float64 {
+	if totalFunctions == 0 || clonedFuncs == 0 {
 		return 0.0
 	}
 
-	return float64(pairCount) / float64(totalFunctions)
+	return float64(clonedFuncs) / float64(totalFunctions)
 }
 
 // cloneMessage returns a human-readable message based on clone pair count.
diff --git a/internal/analyzers/clones/analyzer_test.go b/internal/analyzers/clones/analyzer_test.go
index 3ed0fc3..e589153 100644
--- a/internal/analyzers/clones/analyzer_test.go
+++ b/internal/analyzers/clones/analyzer_test.go
@@ -30,10 +30,17 @@ func buildFunctionNode(name string, childTypes []node.Type) *node.Node {
 		WithRoles([]node.Role{node.RoleFunction, node.RoleDeclaration}).
 		Build()
 
+	// Build nested subtrees so the total node count exceeds minFunctionNodes.
+	// Each child gets 2 sub-children to produce realistic AST depth.
 	children := make([]*node.Node, 0, len(childTypes))
 
-	for _, ct := range childTypes {
+	for i, ct := range childTypes {
 		child := node.NewBuilder().WithType(ct).Build()
+
+		sub1 := node.NewBuilder().WithType(childTypes[i%len(childTypes)]).Build()
+		sub2 := node.NewBuilder().WithType(childTypes[(i+1)%len(childTypes)]).Build()
+		child.Children = []*node.Node{sub1, sub2}
+
 		children = append(children, child)
 	}
 
@@ -407,9 +414,9 @@ func TestShingler_ExtractShingles_Valid(t *testing.T) {
 	shingles := s.ExtractShingles(fn)
 	require.NotNil(t, shingles)
 
-	// Function node itself + 8 children = 9 nodes.
-	// With k=5: 9 - 5 + 1 = 5 shingles.
-	assert.Len(t, shingles, defaultShingleSize)
+	// Function node + 8 children × 3 nodes each = 25 nodes.
+	// With k=5: 25 - 5 + 1 = 21 shingles.
+	assert.Len(t, shingles, 21)
 }
 
 // TestShingler_ExtractShingles_Deterministic verifies same tree produces same shingles.
@@ -450,13 +457,18 @@ func TestClonePairKey(t *testing.T) {
 	assert.Equal(t, key1, key2)
 }
 
-// TestComputeCloneRatio verifies ratio computation.
+// TestComputeCloneRatio verifies ratio = distinct cloned functions / total functions.
 func TestComputeCloneRatio(t *testing.T) {
 	t.Parallel()
 
 	assert.InDelta(t, 0.0, computeCloneRatio(0, 0), testFloatDelta)
 	assert.InDelta(t, 0.0, computeCloneRatio(0, 10), testFloatDelta)
-	assert.InDelta(t, 0.5, computeCloneRatio(5, 10), testFloatDelta)
+
+	// 2 distinct cloned functions out of 10 → 0.2.
+	assert.InDelta(t, 0.2, computeCloneRatio(2, 10), testFloatDelta)
+
+	// 4 out of 4 → 1.0.
+	assert.InDelta(t, 1.0, computeCloneRatio(4, 4), testFloatDelta)
 }
 
 // TestCloneMessage verifies message selection.
@@ -801,10 +813,10 @@ func TestAggregator_RecomputedCloneRatio(t *testing.T) {
 	result := agg.GetResult()
 	assert.Equal(t, 3, result[keyTotalFunctions])
 
-	// 1 clone pair / 3 functions = 0.333...
+	// 1 clone pair → 2 distinct cloned functions out of 3 → 2/3 ≈ 0.667.
 	ratio, ok := result[keyCloneRatio].(float64)
 	require.True(t, ok)
-	assert.InDelta(t, 1.0/3.0, ratio, testFloatDelta)
+	assert.InDelta(t, 2.0/3.0, ratio, testFloatDelta)
 }
 
 // TestAggregator_NoDedupByFuncA verifies multiple pairs sharing func_a name all appear.
@@ -924,8 +936,6 @@ func TestExtractFuncName(t *testing.T) {
 	assert.Equal(t, string(node.UASTFunction), extractFuncName(fn3))
 }
 
-// FRD: specs/frds/FRD-20260311-clones-pair-cap.md.
-
 // TestAggregator_MaxClonePairs_Default verifies NewAggregator sets default cap.
 func TestAggregator_MaxClonePairs_Default(t *testing.T) {
 	t.Parallel()
diff --git a/internal/analyzers/clones/benchmark_test.go b/internal/analyzers/clones/benchmark_test.go
index d1b0c5d..e6eaa3f 100644
--- a/internal/analyzers/clones/benchmark_test.go
+++ b/internal/analyzers/clones/benchmark_test.go
@@ -1,7 +1,5 @@
 package clones
 
-// FRD: specs/frds/FRD-20260311-clones-pair-cap.md.
-
 import (
 	"fmt"
 	"runtime"
diff --git a/internal/analyzers/clones/clone_ratio_fixture_test.go b/internal/analyzers/clones/clone_ratio_fixture_test.go
new file mode 100644
index 0000000..1ad5b3f
--- /dev/null
+++ b/internal/analyzers/clones/clone_ratio_fixture_test.go
@@ -0,0 +1,629 @@
+package clones
+
+import (
+	"context"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/pkg/uast"
+)
+
+// Fixture-based clone ratio tests validate that computeCloneRatio
+// (distinct cloned functions / total functions) produces meaningful, bounded
+// values for known duplication patterns parsed through the real UAST pipeline.
+
+// parseAndAnalyze parses Go source through UAST and runs the clone analyzer.
+func parseAndAnalyze(t *testing.T, source string) analyze.Report {
+	t.Helper()
+
+	parser, err := uast.NewParser()
+	require.NoError(t, err)
+
+	root, parseErr := parser.Parse(context.Background(), "fixture.go", []byte(source))
+	require.NoError(t, parseErr)
+
+	analyzer := NewAnalyzer()
+
+	report, analyzeErr := analyzer.Analyze(root)
+	require.NoError(t, analyzeErr)
+
+	return report
+}
+
+// reportFuncs extracts the total function count from a clone report.
+func reportFuncs(t *testing.T, r analyze.Report) int {
+	t.Helper()
+
+	v, ok := r[keyTotalFunctions].(int)
+	require.True(t, ok, "report must contain int %s", keyTotalFunctions)
+
+	return v
+}
+
+// reportPairs extracts the total clone pair count from a clone report.
+func reportPairs(t *testing.T, r analyze.Report) int {
+	t.Helper()
+
+	v, ok := r[keyTotalClonePairs].(int)
+	require.True(t, ok, "report must contain int %s", keyTotalClonePairs)
+
+	return v
+}
+
+// reportRatio extracts the clone ratio from a clone report.
+func reportRatio(t *testing.T, r analyze.Report) float64 {
+	t.Helper()
+
+	v, ok := r[keyCloneRatio].(float64)
+	require.True(t, ok, "report must contain float64 %s", keyCloneRatio)
+
+	return v
+}
+
+// fixtureAllUnique contains 4 functions with completely different logic.
+// Expected: 0 clone pairs, ratio = 0.
+const fixtureAllUnique = `package fixture
+
+func Sum(nums []int) int {
+	total := 0
+	for _, n := range nums {
+		total += n
+	}
+	return total
+}
+
+func Reverse(s string) string {
+	runes := []rune(s)
+	for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
+		runes[i], runes[j] = runes[j], runes[i]
+	}
+	return string(runes)
+}
+
+func IsPrime(n int) bool {
+	if n < 2 {
+		return false
+	}
+	for i := 2; i*i <= n; i++ {
+		if n%i == 0 {
+			return false
+		}
+	}
+	return true
+}
+
+func Fibonacci(n int) int {
+	if n <= 1 {
+		return n
+	}
+	a, b := 0, 1
+	for i := 2; i <= n; i++ {
+		a, b = b, a+b
+	}
+	return b
+}
+`
+
+// fixtureAllIdentical contains 4 functions with identical bodies (Type-1 clones).
+// Expected: 6 clone pairs (C(4,2)=6), ratio = 1.0.
+const fixtureAllIdentical = `package fixture
+
+func ProcessA(data []int) int {
+	result := 0
+	for _, v := range data {
+		if v > 0 {
+			result += v * 2
+		} else {
+			result -= v
+		}
+	}
+	if result > 100 {
+		result = 100
+	}
+	return result
+}
+
+func ProcessB(data []int) int {
+	result := 0
+	for _, v := range data {
+		if v > 0 {
+			result += v * 2
+		} else {
+			result -= v
+		}
+	}
+	if result > 100 {
+		result = 100
+	}
+	return result
+}
+
+func ProcessC(data []int) int {
+	result := 0
+	for _, v := range data {
+		if v > 0 {
+			result += v * 2
+		} else {
+			result -= v
+		}
+	}
+	if result > 100 {
+		result = 100
+	}
+	return result
+}
+
+func ProcessD(data []int) int {
+	result := 0
+	for _, v := range data {
+		if v > 0 {
+			result += v * 2
+		} else {
+			result -= v
+		}
+	}
+	if result > 100 {
+		result = 100
+	}
+	return result
+}
+`
+
+// fixtureRenamedClones contains 3 functions: 2 are Type-2 clones (same AST
+// structure, different variable names), 1 is unique.
+const fixtureRenamedClones = `package fixture
+
+func CalcScore(items []int) int {
+	score := 0
+	for _, item := range items {
+		if item > 10 {
+			score += item * 3
+		} else {
+			score += item
+		}
+	}
+	if score > 1000 {
+		score = 1000
+	}
+	return score
+}
+
+func ComputeTotal(entries []int) int {
+	total := 0
+	for _, entry := range entries {
+		if entry > 10 {
+			total += entry * 3
+		} else {
+			total += entry
+		}
+	}
+	if total > 1000 {
+		total = 1000
+	}
+	return total
+}
+
+func FormatOutput(s string) string {
+	if len(s) == 0 {
+		return "<empty>"
+	}
+	return "[" + s + "]"
+}
+`
+
+// fixtureHalfClones contains 6 functions: 3 identical clones + 3 unique.
+// Expected: at least C(3,2) = 3 clone pairs; 3 of 6 functions are cloned, so ratio ≈ 0.5.
+const fixtureHalfClones = `package fixture
+
+func CloneA(data []int) int {
+	sum := 0
+	for i := 0; i < len(data); i++ {
+		if data[i] > 0 {
+			sum += data[i]
+		}
+	}
+	if sum > 500 {
+		sum = 500
+	}
+	return sum
+}
+
+func CloneB(data []int) int {
+	sum := 0
+	for i := 0; i < len(data); i++ {
+		if data[i] > 0 {
+			sum += data[i]
+		}
+	}
+	if sum > 500 {
+		sum = 500
+	}
+	return sum
+}
+
+func CloneC(data []int) int {
+	sum := 0
+	for i := 0; i < len(data); i++ {
+		if data[i] > 0 {
+			sum += data[i]
+		}
+	}
+	if sum > 500 {
+		sum = 500
+	}
+	return sum
+}
+
+func UniqueX(n int) bool {
+	if n < 2 {
+		return false
+	}
+	for i := 2; i*i <= n; i++ {
+		if n%i == 0 {
+			return false
+		}
+	}
+	return true
+}
+
+func UniqueY(s string) int {
+	count := 0
+	for _, r := range s {
+		if r >= 'a' && r <= 'z' {
+			count++
+		}
+	}
+	return count
+}
+
+func UniqueZ(a, b int) int {
+	for b != 0 {
+		a, b = b, a%b
+	}
+	return a
+}
+`
+
+func TestFixture_AllUnique_ZeroRatio(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureAllUnique)
+	require.Equal(t, 4, reportFuncs(t, report))
+
+	assert.InDelta(t, 0.0, reportRatio(t, report), 0.05,
+		"4 unique functions must produce near-zero clone ratio")
+}
+
+func TestFixture_AllIdentical_FullRatio(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureAllIdentical)
+	require.Equal(t, 4, reportFuncs(t, report))
+
+	assert.Equal(t, 6, reportPairs(t, report),
+		"4 identical functions must produce C(4,2)=6 clone pairs")
+	assert.InDelta(t, 1.0, reportRatio(t, report), 0.01,
+		"all-identical functions must produce ratio near 1.0")
+
+	section := NewReportSection(report)
+	assert.InDelta(t, 0.0, section.Score(), 0.01)
+}
+
+func TestFixture_RenamedClones_Detected(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureRenamedClones)
+	require.Equal(t, 3, reportFuncs(t, report))
+	assert.GreaterOrEqual(t, reportPairs(t, report), 1,
+		"Type-2 renamed clones must be detected")
+
+	ratio := reportRatio(t, report)
+	assert.Greater(t, ratio, 0.0, "renamed clones must produce non-zero ratio")
+	assert.LessOrEqual(t, ratio, 1.0, "ratio must be bounded to [0, 1]")
+}
+
+func TestFixture_HalfClones_PartialRatio(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureHalfClones)
+	require.Equal(t, 6, reportFuncs(t, report))
+	assert.GreaterOrEqual(t, reportPairs(t, report), 3,
+		"3 identical + 3 unique must produce at least 3 clone pairs")
+
+	ratio := reportRatio(t, report)
+	// 3 cloned functions out of 6 total → 0.5.
+	assert.InDelta(t, 0.5, ratio, 0.1, "ratio must reflect partial duplication")
+}
+
+func TestFixture_RatioBounded(t *testing.T) {
+	t.Parallel()
+
+	fixtures := map[string]string{
+		"all_unique":    fixtureAllUnique,
+		"all_identical": fixtureAllIdentical,
+		"renamed":       fixtureRenamedClones,
+		"half_clones":   fixtureHalfClones,
+	}
+
+	for name, source := range fixtures {
+		t.Run(name, func(t *testing.T) {
+			t.Parallel()
+
+			ratio := reportRatio(t, parseAndAnalyze(t, source))
+			assert.GreaterOrEqual(t, ratio, 0.0, "clone ratio must be >= 0")
+			assert.LessOrEqual(t, ratio, 1.0, "clone ratio must be <= 1")
+		})
+	}
+}
+
+func TestFixture_MonotonicOrdering(t *testing.T) {
+	t.Parallel()
+
+	ratioUnique := reportRatio(t, parseAndAnalyze(t, fixtureAllUnique))
+	ratioHalf := reportRatio(t, parseAndAnalyze(t, fixtureHalfClones))
+	ratioFull := reportRatio(t, parseAndAnalyze(t, fixtureAllIdentical))
+
+	assert.Less(t, ratioUnique, ratioHalf, "unique < half-cloned")
+	assert.Less(t, ratioHalf, ratioFull, "half-cloned < fully-cloned")
+}
+
+// Kubernetes-derived fixtures: real patterns from kubernetes/kubernetes
+// adapted to be self-contained. Validates detection on production-grade code.
+
+// fixtureK8sValidation is adapted from pkg/apis/rbac/validation.
+// ValidateRoleBinding and ValidateClusterRoleBinding are near-identical.
+const fixtureK8sValidation = `package fixture
+
+type ErrorList []string
+type ObjectMeta struct{ Name string }
+
+type Ref struct{ APIGroup, Kind, Name string }
+type Subject struct{ Name string }
+type RoleBinding struct{ ObjectMeta; Role Ref; Subjects []Subject }
+type ClusterRoleBinding struct{ ObjectMeta; Role Ref; Subjects []Subject }
+
+func ValidateRoleBinding(rb *RoleBinding) ErrorList {
+	allErrs := ErrorList{}
+	if rb.ObjectMeta.Name == "" {
+		allErrs = append(allErrs, "metadata.name is required")
+	}
+	if rb.Role.APIGroup != "rbac.authorization.k8s.io" {
+		allErrs = append(allErrs, "roleRef.apiGroup not supported")
+	}
+	switch rb.Role.Kind {
+	case "Role", "ClusterRole":
+	default:
+		allErrs = append(allErrs, "roleRef.kind not supported")
+	}
+	if len(rb.Role.Name) == 0 {
+		allErrs = append(allErrs, "roleRef.name is required")
+	}
+	for _, subject := range rb.Subjects {
+		if subject.Name == "" {
+			allErrs = append(allErrs, "subject.name is required")
+		}
+	}
+	return allErrs
+}
+
+func ValidateClusterRoleBinding(rb *ClusterRoleBinding) ErrorList {
+	allErrs := ErrorList{}
+	if rb.ObjectMeta.Name == "" {
+		allErrs = append(allErrs, "metadata.name is required")
+	}
+	if rb.Role.APIGroup != "rbac.authorization.k8s.io" {
+		allErrs = append(allErrs, "roleRef.apiGroup not supported")
+	}
+	switch rb.Role.Kind {
+	case "ClusterRole":
+	default:
+		allErrs = append(allErrs, "roleRef.kind not supported")
+	}
+	if len(rb.Role.Name) == 0 {
+		allErrs = append(allErrs, "roleRef.name is required")
+	}
+	for _, subject := range rb.Subjects {
+		if subject.Name == "" {
+			allErrs = append(allErrs, "subject.name is required")
+		}
+	}
+	return allErrs
+}
+
+func ValidateRoleBindingUpdate(rb *RoleBinding, old *RoleBinding) ErrorList {
+	allErrs := ValidateRoleBinding(rb)
+	if old.Role != rb.Role {
+		allErrs = append(allErrs, "cannot change roleRef")
+	}
+	return allErrs
+}
+
+func ValidateClusterRoleBindingUpdate(rb *ClusterRoleBinding, old *ClusterRoleBinding) ErrorList {
+	allErrs := ValidateClusterRoleBinding(rb)
+	if old.Role != rb.Role {
+		allErrs = append(allErrs, "cannot change roleRef")
+	}
+	return allErrs
+}
+`
+
+// fixtureK8sEventHandlers is adapted from client-go/tools/cache/controller.go.
+// Three receiver types implement OnAdd/OnUpdate/OnDelete.
+const fixtureK8sEventHandlers = `package fixture
+
+type ResourceEventHandlerFuncs struct {
+	AddFunc    func(obj interface{})
+	UpdateFunc func(oldObj, newObj interface{})
+	DeleteFunc func(obj interface{})
+}
+
+func (r ResourceEventHandlerFuncs) OnAdd(obj interface{}, isInInitialList bool) {
+	if r.AddFunc != nil {
+		r.AddFunc(obj)
+	}
+}
+
+func (r ResourceEventHandlerFuncs) OnUpdate(oldObj, newObj interface{}) {
+	if r.UpdateFunc != nil {
+		r.UpdateFunc(oldObj, newObj)
+	}
+}
+
+func (r ResourceEventHandlerFuncs) OnDelete(obj interface{}) {
+	if r.DeleteFunc != nil {
+		r.DeleteFunc(obj)
+	}
+}
+
+type ResourceEventHandlerDetailedFuncs struct {
+	AddFunc    func(obj interface{}, isInInitialList bool)
+	UpdateFunc func(oldObj, newObj interface{})
+	DeleteFunc func(obj interface{})
+}
+
+func (r ResourceEventHandlerDetailedFuncs) OnAdd(obj interface{}, isInInitialList bool) {
+	if r.AddFunc != nil {
+		r.AddFunc(obj, isInInitialList)
+	}
+}
+
+func (r ResourceEventHandlerDetailedFuncs) OnUpdate(oldObj, newObj interface{}) {
+	if r.UpdateFunc != nil {
+		r.UpdateFunc(oldObj, newObj)
+	}
+}
+
+func (r ResourceEventHandlerDetailedFuncs) OnDelete(obj interface{}) {
+	if r.DeleteFunc != nil {
+		r.DeleteFunc(obj)
+	}
+}
+
+type FilteringResourceEventHandler struct {
+	FilterFunc func(obj interface{}) bool
+	Handler    interface{ OnAdd(interface{}, bool); OnUpdate(interface{}, interface{}); OnDelete(interface{}) }
+}
+
+func (r FilteringResourceEventHandler) OnAdd(obj interface{}, isInInitialList bool) {
+	if !r.FilterFunc(obj) {
+		return
+	}
+	r.Handler.OnAdd(obj, isInInitialList)
+}
+
+func (r FilteringResourceEventHandler) OnUpdate(oldObj, newObj interface{}) {
+	newer := r.FilterFunc(newObj)
+	older := r.FilterFunc(oldObj)
+	switch {
+	case newer && older:
+		r.Handler.OnUpdate(oldObj, newObj)
+	case newer && !older:
+		r.Handler.OnAdd(newObj, false)
+	case !newer && older:
+		r.Handler.OnDelete(oldObj)
+	}
+}
+
+func (r FilteringResourceEventHandler) OnDelete(obj interface{}) {
+	if !r.FilterFunc(obj) {
+		return
+	}
+	r.Handler.OnDelete(obj)
+}
+`
+
+// fixtureK8sDeepCopy is adapted from zz_generated.deepcopy.go files.
+// Machine-generated DeepCopyInto methods on different receiver types.
+const fixtureK8sDeepCopy = `package fixture
+
+type TokenConfig struct{ Token, TTL, Expires *int64; Usages, Groups []string }
+type SecretConfig struct{ Name, TTL, Expires *int64; Labels, Scopes []string }
+type CertConfig struct{ Issuer, TTL, Expires *int64; SANs, Orgs []string }
+
+func (in *TokenConfig) DeepCopyInto(out *TokenConfig) {
+	*out = *in
+	if in.Token != nil { cp := *in.Token; out.Token = &cp }
+	if in.TTL != nil { cp := *in.TTL; out.TTL = &cp }
+	if in.Expires != nil { cp := *in.Expires; out.Expires = &cp }
+	if in.Usages != nil { out.Usages = make([]string, len(in.Usages)); copy(out.Usages, in.Usages) }
+	if in.Groups != nil { out.Groups = make([]string, len(in.Groups)); copy(out.Groups, in.Groups) }
+}
+
+func (in *SecretConfig) DeepCopyInto(out *SecretConfig) {
+	*out = *in
+	if in.Name != nil { cp := *in.Name; out.Name = &cp }
+	if in.TTL != nil { cp := *in.TTL; out.TTL = &cp }
+	if in.Expires != nil { cp := *in.Expires; out.Expires = &cp }
+	if in.Labels != nil { out.Labels = make([]string, len(in.Labels)); copy(out.Labels, in.Labels) }
+	if in.Scopes != nil { out.Scopes = make([]string, len(in.Scopes)); copy(out.Scopes, in.Scopes) }
+}
+
+func (in *CertConfig) DeepCopyInto(out *CertConfig) {
+	*out = *in
+	if in.Issuer != nil { cp := *in.Issuer; out.Issuer = &cp }
+	if in.TTL != nil { cp := *in.TTL; out.TTL = &cp }
+	if in.Expires != nil { cp := *in.Expires; out.Expires = &cp }
+	if in.SANs != nil { out.SANs = make([]string, len(in.SANs)); copy(out.SANs, in.SANs) }
+	if in.Orgs != nil { out.Orgs = make([]string, len(in.Orgs)); copy(out.Orgs, in.Orgs) }
+}
+`
+
+func TestFixtureK8s_Validation_DetectsClonePairs(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureK8sValidation)
+	require.Equal(t, 4, reportFuncs(t, report))
+	assert.GreaterOrEqual(t, reportPairs(t, report), 2,
+		"RBAC validation clones must produce at least 2 clone pairs")
+
+	ratio := reportRatio(t, report)
+	assert.Greater(t, ratio, 0.0)
+	assert.LessOrEqual(t, ratio, 1.0)
+}
+
+func TestFixtureK8s_EventHandlers_DetectsClones(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureK8sEventHandlers)
+	assert.GreaterOrEqual(t, reportFuncs(t, report), 9)
+	assert.GreaterOrEqual(t, reportPairs(t, report), 1,
+		"identical handler methods across receiver types must be detected")
+
+	ratio := reportRatio(t, report)
+	assert.Greater(t, ratio, 0.0, "event handler clones must produce non-zero ratio")
+	assert.LessOrEqual(t, ratio, 1.0)
+}
+
+func TestFixtureK8s_DeepCopy_HighCloneRatio(t *testing.T) {
+	t.Parallel()
+
+	report := parseAndAnalyze(t, fixtureK8sDeepCopy)
+	require.Equal(t, 3, reportFuncs(t, report))
+	assert.Equal(t, 3, reportPairs(t, report),
+		"3 identical DeepCopyInto methods must produce C(3,2)=3 clone pairs")
+	assert.InDelta(t, 1.0, reportRatio(t, report), 0.01,
+		"all-identical DeepCopyInto methods must produce ratio near 1.0")
+}
+
+func TestFixtureK8s_AllBounded(t *testing.T) {
+	t.Parallel()
+
+	fixtures := map[string]string{
+		"validation":     fixtureK8sValidation,
+		"event_handlers": fixtureK8sEventHandlers,
+		"deepcopy":       fixtureK8sDeepCopy,
+	}
+
+	for name, source := range fixtures {
+		t.Run(name, func(t *testing.T) {
+			t.Parallel()
+
+			ratio := reportRatio(t, parseAndAnalyze(t, source))
+			assert.GreaterOrEqual(t, ratio, 0.0, "clone ratio must be >= 0")
+			assert.LessOrEqual(t, ratio, 1.0, "clone ratio must be <= 1")
+		})
+	}
+}
diff --git a/internal/analyzers/clones/report.go b/internal/analyzers/clones/report.go
index 2eb93f7..6deb751 100644
--- a/internal/analyzers/clones/report.go
+++ b/internal/analyzers/clones/report.go
@@ -32,13 +32,14 @@ const DefaultMaxClonePairs = 1000
 
 // Report keys.
 const (
-	keyAnalyzerName    = "analyzer_name"
-	keyTotalClonePairs = "total_clone_pairs"
-	keyClonePairs      = "clone_pairs"
-	keyTotalFunctions  = "total_functions"
-	keyMessage         = "message"
-	keyCloneRatio      = "clone_ratio"
-	keyFuncSignatures  = "_func_signatures"
+	keyAnalyzerName          = "analyzer_name"
+	keyTotalClonePairs       = "total_clone_pairs"
+	keyClonePairs            = "clone_pairs"
+	keyTotalFunctions        = "total_functions"
+	keyMessage               = "message"
+	keyCloneRatio            = "clone_ratio"
+	keyFuncSignatures        = "_func_signatures"
+	keyCloneTypeDistribution = "clone_type_distribution"
 )
 
 // ClonePair represents a detected clone relationship between two functions.
@@ -51,11 +52,12 @@ type ClonePair struct {
 
 // ComputedMetrics holds computed clone detection metrics for JSON/YAML/binary export.
 type ComputedMetrics struct {
-	TotalFunctions  int         `json:"total_functions"   yaml:"total_functions"`
-	TotalClonePairs int         `json:"total_clone_pairs" yaml:"total_clone_pairs"`
-	CloneRatio      float64     `json:"clone_ratio"       yaml:"clone_ratio"`
-	ClonePairs      []ClonePair `json:"clone_pairs"       yaml:"clone_pairs"`
-	Message         string      `json:"message"           yaml:"message"`
+	TotalFunctions  int            `json:"total_functions"                   yaml:"total_functions"`
+	TotalClonePairs int            `json:"total_clone_pairs"                 yaml:"total_clone_pairs"`
+	CloneRatio      float64        `json:"clone_ratio"                       yaml:"clone_ratio"`
+	CloneTypeDist   map[string]int `json:"clone_type_distribution,omitempty" yaml:"clone_type_distribution,omitempty"`
+	ClonePairs      []ClonePair    `json:"clone_pairs"                       yaml:"clone_pairs"`
+	Message         string         `json:"message"                           yaml:"message"`
 }
 
 // cloneTypeClassifier classifies clone similarity into clone types.
@@ -115,6 +117,10 @@ func computeMetricsFromReport(report map[string]any) *ComputedMetrics {
 
 	metrics.ClonePairs = extractClonePairs(report)
 
+	if v, ok := report[keyCloneTypeDistribution].(map[string]int); ok {
+		metrics.CloneTypeDist = v
+	}
+
 	return metrics
 }
 
diff --git a/internal/analyzers/clones/report_section.go b/internal/analyzers/clones/report_section.go
index 6af3762..b53bc7d 100644
--- a/internal/analyzers/clones/report_section.go
+++ b/internal/analyzers/clones/report_section.go
@@ -50,13 +50,18 @@ func NewReportSection(report analyze.Report) *ReportSection {
 }
 
 // computeScore converts clone ratio to a 0-1 score (lower ratio = higher score).
+// The ratio is expected to lie in [0, 1]; out-of-range inputs (for example a
+// pairs/functions ratio above 1.0) are clamped before inverting.
 func computeScore(cloneRatio float64) float64 {
-	score := 1.0 - cloneRatio
-	if score < 0 {
+	if cloneRatio >= 1.0 {
 		return 0.0
 	}
 
-	return score
+	if cloneRatio <= 0.0 {
+		return 1.0
+	}
+
+	return 1.0 - cloneRatio
 }
 
 // KeyMetrics returns ordered key metrics for display.
@@ -69,15 +74,13 @@ func (s *ReportSection) KeyMetrics() []analyze.Metric {
 }
 
 // Distribution returns clone type distribution data.
+// Uses the full-population distribution when available, falling back to the capped pairs array.
 func (s *ReportSection) Distribution() []analyze.DistributionItem {
-	pairs := extractClonePairs(s.report)
-	if len(pairs) == 0 {
+	counts, total := s.extractDistribution()
+	if total == 0 {
 		return nil
 	}
 
-	counts := categorizeClonePairs(pairs)
-	total := len(pairs)
-
 	return []analyze.DistributionItem{
 		{Label: distLabelType1, Percent: reportutil.Pct(counts.type1, total), Count: counts.type1},
 		{Label: distLabelType2, Percent: reportutil.Pct(counts.type2, total), Count: counts.type2},
@@ -85,6 +88,22 @@ func (s *ReportSection) Distribution() []analyze.DistributionItem {
 	}
 }
 
+func (s *ReportSection) extractDistribution() (counts cloneTypeCounts, total int) {
+	if dist, ok := s.report[keyCloneTypeDistribution].(map[string]int); ok {
+		counts = cloneTypeCounts{
+			type1: dist[CloneType1],
+			type2: dist[CloneType2],
+			type3: dist[CloneType3],
+		}
+
+		return counts, counts.type1 + counts.type2 + counts.type3
+	}
+
+	pairs := extractClonePairs(s.report)
+
+	return categorizeClonePairs(pairs), len(pairs)
+}
+
 // cloneTypeCounts holds counts per clone type.
 type cloneTypeCounts struct {
 	type1 int
@@ -92,6 +111,27 @@ type cloneTypeCounts struct {
 	type3 int
 }
 
+// increment adds one to the counter for the given clone type.
+func (c *cloneTypeCounts) increment(cloneType string) {
+	switch cloneType {
+	case CloneType1:
+		c.type1++
+	case CloneType2:
+		c.type2++
+	case CloneType3:
+		c.type3++
+	}
+}
+
+// cloneTypeDistMap converts counts to a string-keyed map for JSON serialization.
+func cloneTypeDistMap(c cloneTypeCounts) map[string]int {
+	return map[string]int{
+		CloneType1: c.type1,
+		CloneType2: c.type2,
+		CloneType3: c.type3,
+	}
+}
+
 // categorizeClonePairs counts clone pairs by type.
 func categorizeClonePairs(pairs []ClonePair) cloneTypeCounts {
 	counts := cloneTypeCounts{}
diff --git a/internal/analyzers/clones/report_section_test.go b/internal/analyzers/clones/report_section_test.go
index 98b2b70..7447b2a 100644
--- a/internal/analyzers/clones/report_section_test.go
+++ b/internal/analyzers/clones/report_section_test.go
@@ -45,6 +45,17 @@ func TestCloneSection_Score(t *testing.T) {
 	assert.InDelta(t, 0.7, s.Score(), 1e-9)
 }
 
+func TestCloneSection_Score_HighRatio(t *testing.T) {
+	t.Parallel()
+
+	// A stored clone ratio can exceed 1.0 when computed as pairs/functions (pairs grow quadratically).
+	// A ratio of 93.6 must clamp the score to 0.0, not go negative.
+	s := NewReportSection(analyze.Report{
+		keyCloneRatio: 93.6,
+	})
+	assert.InDelta(t, 0.0, s.Score(), 1e-9)
+}
+
 func TestCloneSection_StatusMessage(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/clones/visitor.go b/internal/analyzers/clones/visitor.go
index 9292c01..5a9fdc1 100644
--- a/internal/analyzers/clones/visitor.go
+++ b/internal/analyzers/clones/visitor.go
@@ -59,6 +59,10 @@ func (v *Visitor) buildSignatures() []funcEntry {
 	entries := make([]funcEntry, 0, len(v.functions))
 
 	for _, fn := range v.functions {
+		if countNodes(fn) < minFunctionNodes {
+			continue
+		}
+
 		shingles := v.shingler.ExtractShingles(fn)
 		if len(shingles) == 0 {
 			continue
@@ -110,9 +114,18 @@ func buildSignatureReport(totalFunctions int, entries []funcEntry) analyze.Repor
-// findClonePairs queries the LSH index and collects unique clone pairs.
-// pairCap limits the stored pairs slice (0 = unlimited). The returned totalCount
-// reflects ALL unique pairs found, regardless of the cap.
-func findClonePairs(entries []funcEntry, idx *lsh.Index, pairCap int, minSimilarity float64) (pairs []ClonePair, totalCount int) {
+// clonePairResult holds the output of findClonePairs.
+type clonePairResult struct {
+	pairs            []ClonePair
+	totalCount       int
+	typeDistribution cloneTypeCounts
+	clonedFunc       map[string]struct{} // distinct function names involved in any pair.
+}
+
+// findClonePairs queries the LSH index and collects unique clone pairs.
+// pairCap limits the stored pairs slice (0 = unlimited). The result's totalCount
+// reflects ALL unique pairs found, regardless of the cap.
+func findClonePairs(entries []funcEntry, idx *lsh.Index, pairCap int, minSimilarity float64) clonePairResult {
 	seen := make(map[PairKey]bool)
 	sigMap := buildSignatureMap(entries)
+	result := clonePairResult{clonedFunc: make(map[string]struct{})}
 
 	for _, entry := range entries {
 		candidates, err := idx.QueryThreshold(entry.sig, minSimilarity)
@@ -120,14 +133,14 @@ func findClonePairs(entries []funcEntry, idx *lsh.Index, pairCap int, minSimilar
 			continue
 		}
 
-		pairs, totalCount = matchCandidates(entry, candidates, sigMap, seen, pairs, totalCount, pairCap, minSimilarity)
+		result = matchCandidates(entry, candidates, sigMap, seen, result, pairCap, minSimilarity)
 	}
 
-	sort.Slice(pairs, func(i, j int) bool {
-		return pairs[i].Similarity > pairs[j].Similarity
+	sort.Slice(result.pairs, func(i, j int) bool {
+		return result.pairs[i].Similarity > result.pairs[j].Similarity
 	})
 
-	return pairs, totalCount
+	return result
 }
 
 // buildSignatureMap creates a name-to-signature lookup from entries.
@@ -148,11 +161,10 @@ func matchCandidates(
 	candidates []string,
 	sigMap map[string]*minhash.Signature,
 	seen map[PairKey]bool,
-	pairs []ClonePair,
-	totalCount int,
+	result clonePairResult,
 	pairCap int,
 	minSimilarity float64,
-) (updatedPairs []ClonePair, updatedCount int) {
+) clonePairResult {
 	for _, candidateID := range candidates {
 		if candidateID == entry.name {
 			continue
@@ -167,15 +179,18 @@ func matchCandidates(
 
 		pair, ok := computeClonePair(entry, candidateID, sigMap, minSimilarity)
 		if ok {
-			totalCount++
+			result.totalCount++
+			result.typeDistribution.increment(pair.CloneType)
+			result.clonedFunc[pair.FuncA] = struct{}{}
+			result.clonedFunc[pair.FuncB] = struct{}{}
 
-			if pairCap <= 0 || len(pairs) < pairCap {
-				pairs = append(pairs, pair)
+			if pairCap <= 0 || len(result.pairs) < pairCap {
+				result.pairs = append(result.pairs, pair)
 			}
 		}
 	}
 
-	return pairs, totalCount
+	return result
 }
 
 // computeClonePair computes a clone pair between an entry and a candidate.
diff --git a/internal/analyzers/cohesion/aggregator.go b/internal/analyzers/cohesion/aggregator.go
index deb2766..73224ee 100644
--- a/internal/analyzers/cohesion/aggregator.go
+++ b/internal/analyzers/cohesion/aggregator.go
@@ -15,6 +15,7 @@ const (
 // Aggregator aggregates results from multiple cohesion analyses.
 type Aggregator struct {
 	*common.Aggregator
+	common.PerFileRetainer
 }
 
 // NewAggregator creates a new Aggregator.
@@ -34,6 +35,15 @@ func NewAggregator() *Aggregator {
 	}
 }
 
+// Aggregate overrides the base Aggregate method to retain per-file reports.
+func (a *Aggregator) Aggregate(results map[string]analyze.Report) {
+	for _, report := range results {
+		a.Retain(report)
+	}
+
+	a.Aggregator.Aggregate(results)
+}
+
 // aggregatorConfig holds the configuration for the aggregator.
 type aggregatorConfig struct {
 	messageBuilder     func(float64) string
diff --git a/internal/analyzers/cohesion/cohesion.go b/internal/analyzers/cohesion/cohesion.go
index 9320220..f8de1f4 100644
--- a/internal/analyzers/cohesion/cohesion.go
+++ b/internal/analyzers/cohesion/cohesion.go
@@ -168,7 +168,6 @@ func (c *Analyzer) calculateMetrics(functions []Function) map[string]float64 {
 }
 
 // buildResult constructs the final analysis result.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildResult(functions []Function, metrics map[string]float64) analyze.Report {
 	reportItems := c.buildDetailedFunctionsTable(functions)
 	message := c.getCohesionMessage(metrics["cohesion_score"])
@@ -188,7 +187,6 @@ func (c *Analyzer) buildResult(functions []Function, metrics map[string]float64)
 }
 
 // FunctionReportItem is a typed representation of a per-function cohesion report item.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 type FunctionReportItem struct {
 	Name               string
 	CohesionAssessment string
@@ -200,7 +198,6 @@ type FunctionReportItem struct {
 }
 
 // buildDetailedFunctionsTable creates the detailed functions table as typed structs.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildDetailedFunctionsTable(functions []Function) []FunctionReportItem {
 	items := make([]FunctionReportItem, 0, len(functions))
 
diff --git a/internal/analyzers/cohesion/metrics.go b/internal/analyzers/cohesion/metrics.go
index a4853cb..240ab21 100644
--- a/internal/analyzers/cohesion/metrics.go
+++ b/internal/analyzers/cohesion/metrics.go
@@ -23,8 +23,11 @@ type ReportData struct {
 
 // FunctionData holds cohesion data for a single function.
 type FunctionData struct {
-	Name     string
-	Cohesion float64
+	Name       string
+	SourceFile string
+	Language   string
+	Directory  string
+	Cohesion   float64
 }
 
 // ParseReportData extracts ReportData from an analyzer report.
@@ -51,35 +54,62 @@ func ParseReportData(report analyze.Report) (*ReportData, error) {
 		data.Message = v
 	}
 
-	// Parse functions.
-	if functions, ok := report["functions"].([]map[string]any); ok {
-		data.Functions = make([]FunctionData, 0, len(functions))
+	data.Functions = parseReportFunctions(report)
 
-		for _, fn := range functions {
-			fd := FunctionData{}
+	return data, nil
+}
 
-			if name, nameOK := fn["name"].(string); nameOK {
-				fd.Name = name
-			}
+func parseReportFunctions(report analyze.Report) []FunctionData {
+	functions, ok := report["functions"].([]map[string]any)
+	if !ok {
+		return nil
+	}
 
-			if v, vOK := fn["cohesion"].(float64); vOK {
-				fd.Cohesion = v
-			}
+	result := make([]FunctionData, 0, len(functions))
 
-			data.Functions = append(data.Functions, fd)
-		}
+	for _, fn := range functions {
+		result = append(result, parseFunctionData(fn))
 	}
 
-	return data, nil
+	return result
+}
+
+func parseFunctionData(fn map[string]any) FunctionData {
+	fd := FunctionData{}
+
+	if name, ok := fn["name"].(string); ok {
+		fd.Name = name
+	}
+
+	if sf, ok := fn[analyze.SourceFileKey].(string); ok {
+		fd.SourceFile = sf
+	}
+
+	if lang, ok := fn[analyze.LanguageKey].(string); ok {
+		fd.Language = lang
+	}
+
+	if dir, ok := fn[analyze.DirectoryKey].(string); ok {
+		fd.Directory = dir
+	}
+
+	if v, ok := fn["cohesion"].(float64); ok {
+		fd.Cohesion = v
+	}
+
+	return fd
 }
 
 // --- Output Data Types ---.
 
 // FunctionCohesionData contains cohesion data for a function.
 type FunctionCohesionData struct {
-	Name         string  `json:"name"          yaml:"name"`
-	Cohesion     float64 `json:"cohesion"      yaml:"cohesion"`
-	QualityLevel string  `json:"quality_level" yaml:"quality_level"`
+	Name         string  `json:"name"                  yaml:"name"`
+	SourceFile   string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language     string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory    string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	Cohesion     float64 `json:"cohesion"              yaml:"cohesion"`
+	QualityLevel string  `json:"quality_level"         yaml:"quality_level"`
 }
 
 // MetricDist* constants are JSON-compatible distribution keys for metrics output.
@@ -92,10 +122,13 @@ const (
 
 // LowCohesionFunctionData identifies functions with poor cohesion.
 type LowCohesionFunctionData struct {
-	Name           string  `json:"name"           yaml:"name"`
-	Cohesion       float64 `json:"cohesion"       yaml:"cohesion"`
-	RiskLevel      string  `json:"risk_level"     yaml:"risk_level"`
-	Recommendation string  `json:"recommendation" yaml:"recommendation"`
+	Name           string  `json:"name"                  yaml:"name"`
+	SourceFile     string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language       string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory      string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	Cohesion       float64 `json:"cohesion"              yaml:"cohesion"`
+	RiskLevel      string  `json:"risk_level"            yaml:"risk_level"`
+	Recommendation string  `json:"recommendation"        yaml:"recommendation"`
 }
 
 // AggregateData contains summary statistics.
@@ -149,6 +182,9 @@ func (m *FunctionCohesionMetric) Compute(input *ReportData) []FunctionCohesionDa
 
 		result = append(result, FunctionCohesionData{
 			Name:         fn.Name,
+			SourceFile:   fn.SourceFile,
+			Language:     fn.Language,
+			Directory:    fn.Directory,
 			Cohesion:     fn.Cohesion,
 			QualityLevel: qualityLevel,
 		})
@@ -249,6 +285,9 @@ func (m *LowCohesionFunctionMetric) Compute(input *ReportData) []LowCohesionFunc
 
 		result = append(result, LowCohesionFunctionData{
 			Name:           fn.Name,
+			SourceFile:     fn.SourceFile,
+			Language:       fn.Language,
+			Directory:      fn.Directory,
 			Cohesion:       fn.Cohesion,
 			RiskLevel:      riskLevel,
 			Recommendation: recommendation,
diff --git a/internal/analyzers/cohesion/report_section.go b/internal/analyzers/cohesion/report_section.go
index c2e9e92..dad3843 100644
--- a/internal/analyzers/cohesion/report_section.go
+++ b/internal/analyzers/cohesion/report_section.go
@@ -128,6 +128,7 @@ func (s *ReportSection) buildIssues() []analyze.Issue {
 		coh := reportutil.GetFloat64(fn, KeyFuncCohesion)
 		issues = append(issues, analyze.Issue{
 			Name:     name,
+			Location: reportutil.MapString(fn, analyze.SourceFileKey),
 			Value:    reportutil.FormatFloat(coh),
 			Severity: severityForCohesion(coh),
 		})
diff --git a/internal/analyzers/comments/aggregator.go b/internal/analyzers/comments/aggregator.go
index df41219..5666975 100644
--- a/internal/analyzers/comments/aggregator.go
+++ b/internal/analyzers/comments/aggregator.go
@@ -15,6 +15,7 @@ const (
 // Aggregator aggregates results from multiple comment analyses.
 type Aggregator struct {
 	*common.Aggregator
+	common.PerFileRetainer
 	detailed *common.DetailedDataCollector
 }
 
@@ -52,6 +53,10 @@ func (ca *Aggregator) SetAggregationMode(mode analyze.AggregationMode) {
 
 // Aggregate overrides the base Aggregate method to collect detailed comments and functions.
 func (ca *Aggregator) Aggregate(results map[string]analyze.Report) {
+	for _, report := range results {
+		ca.Retain(report)
+	}
+
 	ca.detailed.CollectFromReports(results)
 	ca.Aggregator.Aggregate(results)
 }
diff --git a/internal/analyzers/comments/comments.go b/internal/analyzers/comments/comments.go
index 0d04ace..9bf831b 100644
--- a/internal/analyzers/comments/comments.go
+++ b/internal/analyzers/comments/comments.go
@@ -609,7 +609,6 @@ func (c *Analyzer) buildEmptyResult() analyze.Report {
 }
 
 // buildResult builds the complete analysis result.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildResult(commentDetails []CommentDetail, functions []*node.Node, metrics CommentMetrics) analyze.Report {
 	commentDetailsInterface := c.buildCommentDetailsInterface(commentDetails)
 	detailedCommentsTable := c.buildDetailedCommentsTable(commentDetails)
@@ -660,7 +659,6 @@ func (c *Analyzer) buildCommentDetailsInterface(commentDetails []CommentDetail)
 }
 
 // buildDetailedCommentsTable builds the detailed comments table as typed structs.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildDetailedCommentsTable(commentDetails []CommentDetail) []CommentReportItem {
 	items := make([]CommentReportItem, 0, len(commentDetails))
 	for _, detail := range commentDetails {
@@ -680,7 +678,6 @@ func (c *Analyzer) buildDetailedCommentsTable(commentDetails []CommentDetail) []
 }
 
 // convertCommentReportItems converts typed comment items to []map[string]any for serialization.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func convertCommentReportItems(items any, sourceFile string) []map[string]any {
 	typed, ok := items.([]CommentReportItem)
 	if !ok {
@@ -708,7 +705,6 @@ func convertCommentReportItems(items any, sourceFile string) []map[string]any {
 }
 
 // buildDetailedFunctionsTable builds the detailed functions table as typed structs.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildDetailedFunctionsTable(functions []*node.Node, metrics CommentMetrics) []FunctionReportItem {
 	items := make([]FunctionReportItem, 0, len(functions))
 	for _, function := range functions {
@@ -732,7 +728,6 @@ func (c *Analyzer) buildDetailedFunctionsTable(functions []*node.Node, metrics C
 }
 
 // convertFunctionReportItems converts typed function items to []map[string]any for serialization.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func convertFunctionReportItems(items any, sourceFile string) []map[string]any {
 	typed, ok := items.([]FunctionReportItem)
 	if !ok {
diff --git a/internal/analyzers/comments/metrics.go b/internal/analyzers/comments/metrics.go
index 621bea5..de5d933 100644
--- a/internal/analyzers/comments/metrics.go
+++ b/internal/analyzers/comments/metrics.go
@@ -25,6 +25,9 @@ type ReportData struct {
 // CommentData holds data for a single comment.
 type CommentData struct {
 	LineNumber     int
+	SourceFile     string
+	Language       string
+	Directory      string
 	Quality        string
 	Type           string
 	TargetType     string
@@ -37,6 +40,9 @@ type CommentData struct {
 // FunctionCommentData holds comment data for a function.
 type FunctionCommentData struct {
 	Name         string
+	SourceFile   string
+	Language     string
+	Directory    string
 	HasComment   bool
 	NeedsComment bool
 	CommentScore float64
@@ -113,6 +119,18 @@ func parseComment(comment map[string]any) CommentData {
 		cd.LineNumber = v
 	}
 
+	if sf, ok := comment[analyze.SourceFileKey].(string); ok {
+		cd.SourceFile = sf
+	}
+
+	if lang, ok := comment[analyze.LanguageKey].(string); ok {
+		cd.Language = lang
+	}
+
+	if dir, ok := comment[analyze.DirectoryKey].(string); ok {
+		cd.Directory = dir
+	}
+
 	if v, ok := comment["quality"].(string); ok {
 		cd.Quality = v
 	}
@@ -166,6 +184,18 @@ func parseFunctionComment(fn map[string]any) FunctionCommentData {
 		fd.Name = v
 	}
 
+	if sf, ok := fn[analyze.SourceFileKey].(string); ok {
+		fd.SourceFile = sf
+	}
+
+	if lang, ok := fn[analyze.LanguageKey].(string); ok {
+		fd.Language = lang
+	}
+
+	if dir, ok := fn[analyze.DirectoryKey].(string); ok {
+		fd.Directory = dir
+	}
+
 	if v, ok := fn["has_comment"].(bool); ok {
 		fd.HasComment = v
 	}
@@ -190,6 +220,9 @@ func parseFunctionComment(fn map[string]any) FunctionCommentData {
 // CommentQualityData contains quality assessment for a comment.
 type CommentQualityData struct {
 	LineNumber     int     `json:"line_number"              yaml:"line_number"`
+	SourceFile     string  `json:"source_file,omitempty"    yaml:"source_file,omitempty"`
+	Language       string  `json:"language,omitempty"       yaml:"language,omitempty"`
+	Directory      string  `json:"directory,omitempty"      yaml:"directory,omitempty"`
 	Quality        string  `json:"quality"                  yaml:"quality"`
 	Type           string  `json:"type"                     yaml:"type"`
 	TargetName     string  `json:"target_name"              yaml:"target_name"`
@@ -199,17 +232,23 @@ type CommentQualityData struct {
 
 // FunctionDocumentationData contains documentation status for a function.
 type FunctionDocumentationData struct {
-	Name               string  `json:"name"                yaml:"name"`
-	IsDocumented       bool    `json:"is_documented"       yaml:"is_documented"`
-	DocumentationScore float64 `json:"documentation_score" yaml:"documentation_score"`
-	Status             string  `json:"status"              yaml:"status"`
+	Name               string  `json:"name"                  yaml:"name"`
+	SourceFile         string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language           string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory          string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	IsDocumented       bool    `json:"is_documented"         yaml:"is_documented"`
+	DocumentationScore float64 `json:"documentation_score"   yaml:"documentation_score"`
+	Status             string  `json:"status"                yaml:"status"`
 }
 
 // UndocumentedFunctionData identifies functions lacking documentation.
 type UndocumentedFunctionData struct {
-	Name         string `json:"name"          yaml:"name"`
-	NeedsComment bool   `json:"needs_comment" yaml:"needs_comment"`
-	RiskLevel    string `json:"risk_level"    yaml:"risk_level"`
+	Name         string `json:"name"                  yaml:"name"`
+	SourceFile   string `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language     string `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory    string `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	NeedsComment bool   `json:"needs_comment"         yaml:"needs_comment"`
+	RiskLevel    string `json:"risk_level"            yaml:"risk_level"`
 }
 
 // AggregateData contains summary statistics.
@@ -251,6 +290,9 @@ func (m *CommentQualityMetric) Compute(input *ReportData) []CommentQualityData {
 	for _, comment := range input.Comments {
 		result = append(result, CommentQualityData{
 			LineNumber:     comment.LineNumber,
+			SourceFile:     comment.SourceFile,
+			Language:       comment.Language,
+			Directory:      comment.Directory,
 			Quality:        comment.Quality,
 			Type:           comment.Type,
 			TargetName:     comment.TargetName,
@@ -314,6 +356,9 @@ func (m *FunctionDocumentationMetric) Compute(input *ReportData) []FunctionDocum
 
 		result = append(result, FunctionDocumentationData{
 			Name:               fn.Name,
+			SourceFile:         fn.SourceFile,
+			Language:           fn.Language,
+			Directory:          fn.Directory,
 			IsDocumented:       fn.HasComment,
 			DocumentationScore: fn.CommentScore,
 			Status:             status,
@@ -364,6 +409,9 @@ func (m *UndocumentedFunctionMetric) Compute(input *ReportData) []UndocumentedFu
 
 		result = append(result, UndocumentedFunctionData{
 			Name:         fn.Name,
+			SourceFile:   fn.SourceFile,
+			Language:     fn.Language,
+			Directory:    fn.Directory,
 			NeedsComment: fn.NeedsComment,
 			RiskLevel:    riskLevel,
 		})
diff --git a/internal/analyzers/comments/report_section.go b/internal/analyzers/comments/report_section.go
index 71e7506..fd985ba 100644
--- a/internal/analyzers/comments/report_section.go
+++ b/internal/analyzers/comments/report_section.go
@@ -135,6 +135,7 @@ func (s *ReportSection) buildIssues() []analyze.Issue {
 		name := reportutil.MapString(fn, KeyFuncName)
 		issues = append(issues, analyze.Issue{
 			Name:     name,
+			Location: reportutil.MapString(fn, analyze.SourceFileKey),
 			Value:    IssueValueNoDoc,
 			Severity: analyze.SeverityPoor,
 		})
diff --git a/internal/analyzers/comments/types.go b/internal/analyzers/comments/types.go
index 5bfe9d9..dccbd1c 100644
--- a/internal/analyzers/comments/types.go
+++ b/internal/analyzers/comments/types.go
@@ -70,7 +70,6 @@ type CommentConfig struct {
 }
 
 // CommentReportItem is a typed representation of a per-comment report item.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 type CommentReportItem struct {
 	Comment    string
 	Placement  string
@@ -80,7 +79,6 @@ type CommentReportItem struct {
 }
 
 // FunctionReportItem is a typed representation of a per-function report item.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 type FunctionReportItem struct {
 	Function   string
 	Type       string
diff --git a/internal/analyzers/common/aggregation_mode_test.go b/internal/analyzers/common/aggregation_mode_test.go
index aa56dce..067a5f2 100644
--- a/internal/analyzers/common/aggregation_mode_test.go
+++ b/internal/analyzers/common/aggregation_mode_test.go
@@ -1,7 +1,5 @@
 package common
 
-// FRD: specs/frds/FRD-20260311-summary-only-aggregation.md.
-
 import (
 	"testing"
 
@@ -133,8 +131,6 @@ func TestAggregator_ImplementsAggregationModeAware(t *testing.T) {
 	require.NotNil(t, aware)
 }
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 func TestAggregator_EstimatedStateSize_Empty(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/common/aggregator_bench_test.go b/internal/analyzers/common/aggregator_bench_test.go
index aafced3..ef2bdf9 100644
--- a/internal/analyzers/common/aggregator_bench_test.go
+++ b/internal/analyzers/common/aggregator_bench_test.go
@@ -1,7 +1,5 @@
 package common
 
-// FRD: specs/frds/FRD-20260311-summary-only-aggregation.md.
-
 import (
 	"fmt"
 	"runtime"
@@ -42,8 +40,6 @@ func makeSyntheticReport(fileIndex, numFunctions int) analyze.Report {
 	}
 }
 
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
-
 // testFunctionMetrics is a typed struct for benchmark comparison.
 type testFunctionMetrics struct {
 	Name                 string
@@ -183,8 +179,6 @@ func BenchmarkTypedVsMapAccumulation(b *testing.B) {
 	})
 }
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 // benchEstimatedSizeReportCount is the number of reports for size estimation benchmark.
 const benchEstimatedSizeReportCount = 10000
 
diff --git a/internal/analyzers/common/checkpoint_helper_test.go b/internal/analyzers/common/checkpoint_helper_test.go
index 8b16cd1..0d950ba 100644
--- a/internal/analyzers/common/checkpoint_helper_test.go
+++ b/internal/analyzers/common/checkpoint_helper_test.go
@@ -1,7 +1,5 @@
 package common_test
 
-// FRD: specs/frds/FRD-20260302-checkpoint-helper.md.
-
 import (
 	"testing"
 
diff --git a/internal/analyzers/common/computed_metrics_test.go b/internal/analyzers/common/computed_metrics_test.go
index 1c5c5f9..892aa00 100644
--- a/internal/analyzers/common/computed_metrics_test.go
+++ b/internal/analyzers/common/computed_metrics_test.go
@@ -1,7 +1,5 @@
 package common_test
 
-// FRD: specs/frds/FRD-20260302-computed-metrics.md.
-
 import (
 	"testing"
 
diff --git a/internal/analyzers/common/context_stack_test.go b/internal/analyzers/common/context_stack_test.go
index 83858b5..76b9a84 100644
--- a/internal/analyzers/common/context_stack_test.go
+++ b/internal/analyzers/common/context_stack_test.go
@@ -1,7 +1,5 @@
 package common_test
 
-// FRD: specs/frds/FRD-20260302-context-stack.md.
-
 import (
 	"testing"
 
diff --git a/internal/analyzers/common/detailed_data_collector.go b/internal/analyzers/common/detailed_data_collector.go
index 3d5fc7f..db24cee 100644
--- a/internal/analyzers/common/detailed_data_collector.go
+++ b/internal/analyzers/common/detailed_data_collector.go
@@ -1,7 +1,5 @@
 package common
 
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
-
 import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 )
@@ -89,7 +87,9 @@ func (d *DetailedDataCollector) buildItems(key string) []map[string]any {
 	items := make([]map[string]any, 0, capacity)
 
 	for _, tc := range typed {
-		items = append(items, tc.ToMaps(tc.Items, tc.SourceFile)...)
+		converted := tc.ToMaps(tc.Items, tc.SourceFile)
+		stampCollectionMetadata(converted, tc)
+		items = append(items, converted...)
 	}
 
 	items = append(items, legacy...)
@@ -97,6 +97,21 @@ func (d *DetailedDataCollector) buildItems(key string) []map[string]any {
 	return items
 }
 
+// stampCollectionMetadata adds language and directory from a TypedCollection
+// to each converted map item. The converter only passes sourceFile; this
+// stamps the remaining metadata fields that TypedCollection carries.
+func stampCollectionMetadata(items []map[string]any, tc analyze.TypedCollection) {
+	for _, item := range items {
+		if tc.Language != "" {
+			item[analyze.LanguageKey] = tc.Language
+		}
+
+		if tc.Directory != "" {
+			item[analyze.DirectoryKey] = tc.Directory
+		}
+	}
+}
+
 // typedCollectionLen returns the length of a TypedCollection's Items slice
 // using a type switch for known slice types, falling back to 0.
 func typedCollectionLen(tc analyze.TypedCollection) int {
diff --git a/internal/analyzers/common/detailed_data_collector_test.go b/internal/analyzers/common/detailed_data_collector_test.go
index beb4207..f1c16ec 100644
--- a/internal/analyzers/common/detailed_data_collector_test.go
+++ b/internal/analyzers/common/detailed_data_collector_test.go
@@ -9,8 +9,6 @@ import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 )
 
-// FRD: specs/frds/FRD-20260303-detailed-data-collector.md.
-
 func TestNewDetailedDataCollector(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/common/filter_test.go b/internal/analyzers/common/filter_test.go
index 75b4f13..145bdb3 100644
--- a/internal/analyzers/common/filter_test.go
+++ b/internal/analyzers/common/filter_test.go
@@ -1,7 +1,5 @@
 package common_test
 
-// FRD: specs/frds/FRD-20260302-filter-by-interface.md.
-
 import (
 	"testing"
 
diff --git a/internal/analyzers/common/identity_mixin_test.go b/internal/analyzers/common/identity_mixin_test.go
index 8b545ee..449ae02 100644
--- a/internal/analyzers/common/identity_mixin_test.go
+++ b/internal/analyzers/common/identity_mixin_test.go
@@ -1,4 +1,3 @@
-// FRD: specs/frds/FRD-20260302-identity-mixin.md.
 package common_test
 
 import (
diff --git a/internal/analyzers/common/metrics_processor_test.go b/internal/analyzers/common/metrics_processor_test.go
index 25285ea..c2612e9 100644
--- a/internal/analyzers/common/metrics_processor_test.go
+++ b/internal/analyzers/common/metrics_processor_test.go
@@ -282,8 +282,6 @@ func TestMetricsProcessor_IntegrationWorkflow(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 func TestMetricsProcessor_EstimatedStateBytes_Empty(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/common/no_state_hibernation_test.go b/internal/analyzers/common/no_state_hibernation_test.go
index bd4af5f..015dca4 100644
--- a/internal/analyzers/common/no_state_hibernation_test.go
+++ b/internal/analyzers/common/no_state_hibernation_test.go
@@ -1,7 +1,5 @@
 package common_test
 
-// FRD: specs/frds/FRD-20260302-no-state-hibernation.md.
-
 import (
 	"testing"
 	"unsafe"
diff --git a/internal/analyzers/common/perfile_retainer.go b/internal/analyzers/common/perfile_retainer.go
new file mode 100644
index 0000000..aeafe19
--- /dev/null
+++ b/internal/analyzers/common/perfile_retainer.go
@@ -0,0 +1,91 @@
+package common
+
+import (
+	"maps"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+// PerFileRetainer stores per-file report snapshots during static analysis aggregation.
+// When enabled, each call to Retain stores a shallow clone of the report keyed by source file path.
+// When disabled (default), Retain is a no-op and PerFileResults returns nil.
+type PerFileRetainer struct {
+	enabled bool
+	reports map[string]analyze.Report
+}
+
+// SetPerFileMode enables or disables per-file report retention.
+func (r *PerFileRetainer) SetPerFileMode(enabled bool) {
+	r.enabled = enabled
+
+	if enabled && r.reports == nil {
+		r.reports = make(map[string]analyze.Report)
+	}
+}
+
+// Retain extracts the source file path from the report and stores a shallow clone.
+// No-op when per-file mode is disabled, the report is nil, or it has no source file path.
+func (r *PerFileRetainer) Retain(report analyze.Report) {
+	if !r.enabled || report == nil {
+		return
+	}
+
+	filePath := extractSourceFile(report)
+	if filePath == "" {
+		return
+	}
+
+	r.reports[filePath] = cloneReport(report)
+}
+
+// PerFileResults returns the retained per-file reports keyed by file path.
+// Returns nil when per-file mode is disabled or no files were retained.
+func (r *PerFileRetainer) PerFileResults() map[string]analyze.Report {
+	if !r.enabled || len(r.reports) == 0 {
+		return nil
+	}
+
+	return r.reports
+}
+
+// extractSourceFile finds the source file path from report values.
+// Checks top-level SourceFileKey first, then collection-level sources.
+func extractSourceFile(report analyze.Report) string {
+	if sf, ok := report[analyze.SourceFileKey].(string); ok && sf != "" {
+		return sf
+	}
+
+	return extractSourceFileFromCollections(report)
+}
+
+// extractSourceFileFromCollections checks TypedCollection.SourceFile and legacy _source_file items.
+func extractSourceFileFromCollections(report analyze.Report) string {
+	for _, val := range report {
+		if sf := sourceFileFromValue(val); sf != "" {
+			return sf
+		}
+	}
+
+	return ""
+}
+
+// sourceFileFromValue extracts a source file path from a single report value.
+func sourceFileFromValue(val any) string {
+	switch typed := val.(type) {
+	case analyze.TypedCollection:
+		return typed.SourceFile
+	case []map[string]any:
+		for _, item := range typed {
+			if sf, ok := item[analyze.SourceFileKey].(string); ok && sf != "" {
+				return sf
+			}
+		}
+	}
+
+	return ""
+}
+
+// cloneReport creates a shallow clone of a report map.
+func cloneReport(report analyze.Report) analyze.Report {
+	return maps.Clone(report)
+}
diff --git a/internal/analyzers/common/perfile_retainer_test.go b/internal/analyzers/common/perfile_retainer_test.go
new file mode 100644
index 0000000..c7dfd14
--- /dev/null
+++ b/internal/analyzers/common/perfile_retainer_test.go
@@ -0,0 +1,106 @@
+package common
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+func TestPerFileRetainer_Disabled_ReturnsNil(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+
+	retainer.Retain(analyze.Report{"total_functions": 5})
+
+	assert.Nil(t, retainer.PerFileResults())
+}
+
+func TestPerFileRetainer_Enabled_RetainsThreeFiles(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+	retainer.SetPerFileMode(true)
+
+	retainer.Retain(analyze.Report{
+		"total_functions": 3,
+		"functions":       analyze.TypedCollection{SourceFile: "/repo/a.go"},
+	})
+	retainer.Retain(analyze.Report{
+		"total_functions": 5,
+		"functions":       analyze.TypedCollection{SourceFile: "/repo/b.go"},
+	})
+	retainer.Retain(analyze.Report{
+		"total_functions": 2,
+		"functions":       analyze.TypedCollection{SourceFile: "/repo/c.go"},
+	})
+
+	results := retainer.PerFileResults()
+	require.Len(t, results, 3)
+	assert.Contains(t, results, "/repo/a.go")
+	assert.Contains(t, results, "/repo/b.go")
+	assert.Contains(t, results, "/repo/c.go")
+	assert.Equal(t, 3, results["/repo/a.go"]["total_functions"])
+}
+
+func TestPerFileRetainer_LegacyMapSlice(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+	retainer.SetPerFileMode(true)
+
+	retainer.Retain(analyze.Report{
+		"functions": []map[string]any{
+			{"name": "Foo", analyze.SourceFileKey: "/repo/legacy.go"},
+		},
+	})
+
+	results := retainer.PerFileResults()
+	require.Len(t, results, 1)
+	assert.Contains(t, results, "/repo/legacy.go")
+}
+
+func TestPerFileRetainer_NilReport(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+	retainer.SetPerFileMode(true)
+
+	retainer.Retain(nil)
+
+	assert.Nil(t, retainer.PerFileResults())
+}
+
+func TestPerFileRetainer_NoSourceFile(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+	retainer.SetPerFileMode(true)
+
+	retainer.Retain(analyze.Report{"total_functions": 5})
+
+	assert.Nil(t, retainer.PerFileResults())
+}
+
+func TestPerFileRetainer_CloneIsolation(t *testing.T) {
+	t.Parallel()
+
+	var retainer PerFileRetainer
+	retainer.SetPerFileMode(true)
+
+	report := analyze.Report{
+		"count":     10,
+		"functions": analyze.TypedCollection{SourceFile: "/repo/x.go"},
+	}
+
+	retainer.Retain(report)
+
+	// Mutate original — retained copy must not change.
+	report["count"] = 999
+
+	results := retainer.PerFileResults()
+	assert.Equal(t, 10, results["/repo/x.go"]["count"])
+}
diff --git a/internal/analyzers/common/plotpage/multipage_test.go b/internal/analyzers/common/plotpage/multipage_test.go
index 43dbeae..8695f64 100644
--- a/internal/analyzers/common/plotpage/multipage_test.go
+++ b/internal/analyzers/common/plotpage/multipage_test.go
@@ -1,7 +1,5 @@
 package plotpage
 
-// FRD: specs/frds/FRD-20260228-multipage-renderer.md.
-
 import (
 	"os"
 	"path/filepath"
diff --git a/internal/analyzers/common/renderer/json.go b/internal/analyzers/common/renderer/json.go
index 13d7af8..d9359ac 100644
--- a/internal/analyzers/common/renderer/json.go
+++ b/internal/analyzers/common/renderer/json.go
@@ -17,6 +17,18 @@ type JSONSection struct {
 	Metrics      []JSONMetric       `json:"metrics"`
 	Distribution []JSONDistribution `json:"distribution,omitempty"`
 	Issues       []JSONIssue        `json:"issues"`
+	Files        *[]JSONFileEntry   `json:"files,omitempty"`
+	Score        float64            `json:"score"`
+}
+
+// JSONFileEntry represents one file's analysis results within a section.
+type JSONFileEntry struct {
+	FilePath     string             `json:"file_path"`
+	ScoreLabel   string             `json:"score_label"`
+	Status       string             `json:"status"`
+	Metrics      []JSONMetric       `json:"metrics"`
+	Distribution []JSONDistribution `json:"distribution,omitempty"`
+	Issues       []JSONIssue        `json:"issues"`
 	Score        float64            `json:"score"`
 }
 
@@ -84,6 +96,83 @@ func SectionToJSON(section analyze.ReportSection) JSONSection {
 	}
 }
 
+// EnrichWithPerFileData attaches per-file entries to the matching JSON sections.
+// Implements analyze.PerFileEnricher to avoid import cycles.
+func (r *JSONReport) EnrichWithPerFileData(
+	perFileResults map[string]map[string]analyze.Report,
+	rootPath string,
+	analyzers []analyze.FormattableAnalyzer,
+) {
+	// Build analyzer name → (section title, provider) mapping.
+	type analyzerInfo struct {
+		title    string
+		provider analyze.ReportSectionProvider
+	}
+
+	infoByName := make(map[string]analyzerInfo, len(analyzers))
+
+	for _, analyzer := range analyzers {
+		provider, ok := analyzer.(analyze.ReportSectionProvider)
+		if !ok {
+			continue
+		}
+
+		emptySection := provider.CreateReportSection(analyze.Report{})
+		infoByName[analyzer.Name()] = analyzerInfo{
+			title:    emptySection.SectionTitle(),
+			provider: provider,
+		}
+	}
+
+	// Build section title → index for O(1) lookup.
+	titleToIdx := make(map[string]int, len(r.Sections))
+	for idx, section := range r.Sections {
+		titleToIdx[section.Title] = idx
+	}
+
+	// Initialize all sections with empty files array (spec: empty array, not omitted).
+	for idx := range r.Sections {
+		emptyFiles := make([]JSONFileEntry, 0)
+		r.Sections[idx].Files = &emptyFiles
+	}
+
+	for analyzerName, fileReports := range perFileResults {
+		info, ok := infoByName[analyzerName]
+		if !ok {
+			continue
+		}
+
+		idx, found := titleToIdx[info.title]
+		if !found {
+			continue
+		}
+
+		files := make([]JSONFileEntry, 0, len(fileReports))
+		for filePath, report := range fileReports {
+			section := info.provider.CreateReportSection(report)
+			relPath := analyze.MakeRelativePath(filePath, rootPath)
+			files = append(files, SectionToJSONFileEntry(section, relPath))
+		}
+
+		r.Sections[idx].Files = &files
+	}
+}
+
+// SectionToJSONFileEntry converts a ReportSection to a JSONFileEntry for per-file output.
+func SectionToJSONFileEntry(section analyze.ReportSection, filePath string) JSONFileEntry {
+	base := SectionToJSON(section)
+
+	return JSONFileEntry{
+		FilePath:     filePath,
+		Score:        base.Score,
+		ScoreLabel:   base.ScoreLabel,
+		Status:       base.Status,
+		Metrics:      base.Metrics,
+		Distribution: base.Distribution,
+		Issues:       base.Issues,
+	}
+}
+
 // SectionsToJSON converts multiple ReportSections to a JSONReport with overall score.
 func SectionsToJSON(sections []analyze.ReportSection) JSONReport {
 	summary := NewExecutiveSummary(sections)
diff --git a/internal/analyzers/common/renderer/json_test.go b/internal/analyzers/common/renderer/json_test.go
index e4e3bb1..a3ee22a 100644
--- a/internal/analyzers/common/renderer/json_test.go
+++ b/internal/analyzers/common/renderer/json_test.go
@@ -192,3 +192,88 @@ func TestSectionsToJSON_Serializable(t *testing.T) {
 	assert.Contains(t, string(data), `"title":"COMPLEXITY"`)
 	assert.Contains(t, string(data), `"overall_score":0.8`)
 }
+
+func TestJSONSection_NoFiles_OmittedFromJSON(t *testing.T) {
+	t.Parallel()
+
+	section := JSONSection{
+		Title:      "COMPLEXITY",
+		Score:      0.8,
+		ScoreLabel: "8/10",
+		Status:     "Good",
+		Metrics:    []JSONMetric{{Label: "Total Functions", Value: "42"}},
+		Issues:     []JSONIssue{},
+	}
+
+	data, err := json.Marshal(section)
+	require.NoError(t, err)
+
+	jsonStr := string(data)
+	assert.NotContains(t, jsonStr, `"files"`, "files must be omitted when nil")
+}
+
+func TestJSONSection_WithFiles_IncludedInJSON(t *testing.T) {
+	t.Parallel()
+
+	section := JSONSection{
+		Title:      "COMPLEXITY",
+		Score:      0.8,
+		ScoreLabel: "8/10",
+		Status:     "Good",
+		Metrics:    []JSONMetric{{Label: "Total Functions", Value: "42"}},
+		Issues:     []JSONIssue{},
+		Files: &[]JSONFileEntry{
+			{
+				FilePath:   "pkg/foo/bar.go",
+				Score:      0.6,
+				ScoreLabel: "6/10",
+				Status:     "Fair",
+				Metrics:    []JSONMetric{{Label: "Total Functions", Value: "12"}},
+				Issues:     []JSONIssue{},
+			},
+		},
+	}
+
+	data, err := json.Marshal(section)
+	require.NoError(t, err)
+
+	jsonStr := string(data)
+	assert.Contains(t, jsonStr, `"files"`)
+	assert.Contains(t, jsonStr, `"file_path":"pkg/foo/bar.go"`)
+	assert.Contains(t, jsonStr, `"score":0.6`)
+}
+
+func TestJSONSection_PerFileRoundTrip(t *testing.T) {
+	t.Parallel()
+
+	original := JSONSection{
+		Title:      "HALSTEAD",
+		Score:      0.7,
+		ScoreLabel: "7/10",
+		Status:     "Fair",
+		Metrics:    []JSONMetric{{Label: "Volume", Value: "500"}},
+		Issues:     []JSONIssue{},
+		Files: &[]JSONFileEntry{
+			{
+				FilePath:   "cmd/main.go",
+				Score:      0.5,
+				ScoreLabel: "5/10",
+				Status:     "Moderate",
+				Metrics:    []JSONMetric{{Label: "Volume", Value: "200"}},
+				Issues:     []JSONIssue{},
+			},
+		},
+	}
+
+	data, err := json.Marshal(original)
+	require.NoError(t, err)
+
+	var decoded JSONSection
+	require.NoError(t, json.Unmarshal(data, &decoded))
+
+	assert.Equal(t, original.Title, decoded.Title)
+	require.NotNil(t, decoded.Files)
+	require.Len(t, *decoded.Files, 1)
+	assert.Equal(t, "cmd/main.go", (*decoded.Files)[0].FilePath)
+	assert.InDelta(t, 0.5, (*decoded.Files)[0].Score, 0.001)
+}
diff --git a/internal/analyzers/common/renderer/static_renderer.go b/internal/analyzers/common/renderer/static_renderer.go
index ca3910d..f1f6f25 100644
--- a/internal/analyzers/common/renderer/static_renderer.go
+++ b/internal/analyzers/common/renderer/static_renderer.go
@@ -18,8 +18,11 @@ func NewDefaultStaticRenderer() *DefaultStaticRenderer {
 }
 
 // SectionsToJSON converts report sections to a JSON-serializable value.
+// It returns a pointer so callers can enrich the report per file via the PerFileEnricher interface.
 func (r *DefaultStaticRenderer) SectionsToJSON(sections []analyze.ReportSection) any {
-	return SectionsToJSON(sections)
+	report := SectionsToJSON(sections)
+
+	return &report
 }
 
 // RenderText writes human-readable text output for the given sections.
diff --git a/internal/analyzers/common/reportutil/reportutil_test.go b/internal/analyzers/common/reportutil/reportutil_test.go
index 4d417af..04f7782 100644
--- a/internal/analyzers/common/reportutil/reportutil_test.go
+++ b/internal/analyzers/common/reportutil/reportutil_test.go
@@ -1,8 +1,5 @@
 package reportutil
 
-// FRD: specs/frds/FRD-20260302-safeconv-wiring.md.
-// FRD: specs/frds/FRD-20260306-reportutil-getas.md.
-
 import (
 	"testing"
 )
diff --git a/internal/analyzers/common/spillable_bench_test.go b/internal/analyzers/common/spillable_bench_test.go
index fa34223..54eeda6 100644
--- a/internal/analyzers/common/spillable_bench_test.go
+++ b/internal/analyzers/common/spillable_bench_test.go
@@ -1,7 +1,5 @@
 package common
 
-// FRD: specs/frds/FRD-20260311-spillable-data-collector.md.
-
 import (
 	"fmt"
 	"runtime"
diff --git a/internal/analyzers/common/spillable_data_collector_test.go b/internal/analyzers/common/spillable_data_collector_test.go
index c5a0b09..e7ecb36 100644
--- a/internal/analyzers/common/spillable_data_collector_test.go
+++ b/internal/analyzers/common/spillable_data_collector_test.go
@@ -1,7 +1,5 @@
 package common
 
-// FRD: specs/frds/FRD-20260311-spillable-data-collector.md.
-
 import (
 	"testing"
 
@@ -284,8 +282,6 @@ func TestSpillableDataCollector_NoSpillMatchesSpill(t *testing.T) {
 	assert.Equal(t, noSpillData, withSpillData)
 }
 
-// FRD: specs/frds/FRD-20260311-halstead-dedup.md.
-
 func TestSpillableDataCollector_CompositeKeys_PreventsCrossFileOverwrite(t *testing.T) {
 	t.Parallel()
 
@@ -398,8 +394,6 @@ func TestSpillableDataCollector_CompositeKeys_GetIdentifierKey(t *testing.T) {
 	assert.Equal(t, "name", sdc.GetIdentifierKey())
 }
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 func TestSpillableDataCollector_EstimatedBufferBytes_Empty(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/common/threshold_labeler_test.go b/internal/analyzers/common/threshold_labeler_test.go
index 6c45691..2408a8b 100644
--- a/internal/analyzers/common/threshold_labeler_test.go
+++ b/internal/analyzers/common/threshold_labeler_test.go
@@ -7,7 +7,6 @@ import (
 )
 
 // thresholdLabelerFixture returns a standard 4-bucket labeler for tests.
-// FRD: specs/frds/FRD-20260306-threshold-labeler.md.
 func thresholdLabelerFixture() ThresholdLabeler {
 	return ThresholdLabeler{
 		{Limit: 0.8, Label: "Excellent"},
diff --git a/internal/analyzers/common/uast_traversal_test.go b/internal/analyzers/common/uast_traversal_test.go
index 0a9c730..472b8de 100644
--- a/internal/analyzers/common/uast_traversal_test.go
+++ b/internal/analyzers/common/uast_traversal_test.go
@@ -363,8 +363,6 @@ func TestUASTTraverser_matchesRoles(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260310-find-nodes-predicate.md.
-
 func TestUASTTraverser_FindNodes(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/complexity/aggregator.go b/internal/analyzers/complexity/aggregator.go
index 148e30d..bd5c4e2 100644
--- a/internal/analyzers/complexity/aggregator.go
+++ b/internal/analyzers/complexity/aggregator.go
@@ -17,6 +17,7 @@ const msgGoodComplexity = "Good complexity - functions have reasonable complexit
 // Aggregator aggregates results from multiple complexity analyses.
 type Aggregator struct {
 	*common.Aggregator
+	common.PerFileRetainer
 	detailed      *common.DetailedDataCollector
 	maxComplexity int
 }
@@ -52,6 +53,10 @@ func (ca *Aggregator) SetAggregationMode(mode analyze.AggregationMode) {
 // Aggregate overrides the base Aggregate method to collect detailed functions
 // and track the true maximum complexity across all files.
 func (ca *Aggregator) Aggregate(results map[string]analyze.Report) {
+	for _, report := range results {
+		ca.Retain(report)
+	}
+
 	ca.detailed.CollectFromReports(results)
 	ca.trackMaxComplexity(results)
 	ca.Aggregator.Aggregate(results)
diff --git a/internal/analyzers/complexity/complexity.go b/internal/analyzers/complexity/complexity.go
index 04b4e3c..eaeacb0 100644
--- a/internal/analyzers/complexity/complexity.go
+++ b/internal/analyzers/complexity/complexity.go
@@ -86,7 +86,6 @@ type FunctionMetrics struct {
 
 // FunctionReportItem is a typed representation of a per-function complexity report item.
 // It includes assessment strings computed from thresholds, avoiding map[string]any allocation.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 type FunctionReportItem struct {
 	Name                 string
 	CyclomaticComplexity int
@@ -297,7 +296,6 @@ func (c *Analyzer) buildEmptyResult(message string) analyze.Report {
 }
 
 // buildDetailedFunctionsTable creates the detailed functions table as typed structs.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildDetailedFunctionsTable(
 	functionMetrics []FunctionMetrics,
 	config Config,
@@ -360,7 +358,6 @@ func (c *Analyzer) calculateAverageComplexity(totals map[string]int, functionCou
 }
 
 // buildResult constructs the final analysis result.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (c *Analyzer) buildResult(
 	functionCount int,
 	avgComplexity float64,
diff --git a/internal/analyzers/complexity/metrics.go b/internal/analyzers/complexity/metrics.go
index 317447f..20c2f5f 100644
--- a/internal/analyzers/complexity/metrics.go
+++ b/internal/analyzers/complexity/metrics.go
@@ -26,6 +26,9 @@ type ReportData struct {
 // FunctionData holds complexity data for a single function.
 type FunctionData struct {
 	Name                 string
+	SourceFile           string
+	Language             string
+	Directory            string
 	CyclomaticComplexity int
 	CognitiveComplexity  int
 	NestingDepth         int
@@ -100,6 +103,18 @@ func parseFunctionData(fn map[string]any) FunctionData {
 		fd.Name = name
 	}
 
+	if sf, ok := fn[analyze.SourceFileKey].(string); ok {
+		fd.SourceFile = sf
+	}
+
+	if lang, ok := fn[analyze.LanguageKey].(string); ok {
+		fd.Language = lang
+	}
+
+	if dir, ok := fn[analyze.DirectoryKey].(string); ok {
+		fd.Directory = dir
+	}
+
 	if v, ok := fn["cyclomatic_complexity"].(int); ok {
 		fd.CyclomaticComplexity = v
 	}
@@ -136,6 +151,9 @@ func parseFunctionData(fn map[string]any) FunctionData {
 // FunctionComplexityData contains detailed complexity for a function.
 type FunctionComplexityData struct {
 	Name                 string  `json:"name"                  yaml:"name"`
+	SourceFile           string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language             string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory            string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
 	CyclomaticComplexity int     `json:"cyclomatic_complexity" yaml:"cyclomatic_complexity"`
 	CognitiveComplexity  int     `json:"cognitive_complexity"  yaml:"cognitive_complexity"`
 	NestingDepth         int     `json:"nesting_depth"         yaml:"nesting_depth"`
@@ -154,6 +172,9 @@ const (
 // HighRiskFunctionData identifies functions needing refactoring attention.
 type HighRiskFunctionData struct {
 	Name                 string   `json:"name"                  yaml:"name"`
+	SourceFile           string   `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language             string   `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory            string   `json:"directory,omitempty"   yaml:"directory,omitempty"`
 	CyclomaticComplexity int      `json:"cyclomatic_complexity" yaml:"cyclomatic_complexity"`
 	CognitiveComplexity  int      `json:"cognitive_complexity"  yaml:"cognitive_complexity"`
 	RiskLevel            string   `json:"risk_level"            yaml:"risk_level"`
@@ -221,6 +242,9 @@ func (m *FunctionComplexityMetric) Compute(input *ReportData) []FunctionComplexi
 
 		result = append(result, FunctionComplexityData{
 			Name:                 fn.Name,
+			SourceFile:           fn.SourceFile,
+			Language:             fn.Language,
+			Directory:            fn.Directory,
 			CyclomaticComplexity: fn.CyclomaticComplexity,
 			CognitiveComplexity:  fn.CognitiveComplexity,
 			NestingDepth:         fn.NestingDepth,
@@ -351,6 +375,9 @@ func (m *HighRiskFunctionMetric) Compute(input *ReportData) []HighRiskFunctionDa
 
 		result = append(result, HighRiskFunctionData{
 			Name:                 fn.Name,
+			SourceFile:           fn.SourceFile,
+			Language:             fn.Language,
+			Directory:            fn.Directory,
 			CyclomaticComplexity: fn.CyclomaticComplexity,
 			CognitiveComplexity:  fn.CognitiveComplexity,
 			RiskLevel:            riskLevel,
diff --git a/internal/analyzers/complexity/metrics_test.go b/internal/analyzers/complexity/metrics_test.go
index df0e9ff..8f29adf 100644
--- a/internal/analyzers/complexity/metrics_test.go
+++ b/internal/analyzers/complexity/metrics_test.go
@@ -118,6 +118,42 @@ func TestParseReportData_WithAssessments(t *testing.T) {
 	assert.Equal(t, "low", data.Functions[0].NestingAssessment)
 }
 
+const testSourceFile = "pkg/auth/handler.go"
+
+func TestParseReportData_WithSourceFile(t *testing.T) {
+	t.Parallel()
+
+	report := analyze.Report{
+		"functions": []map[string]any{
+			{
+				"name":         testFunctionName,
+				"_source_file": testSourceFile,
+			},
+		},
+	}
+
+	data, err := ParseReportData(report)
+
+	require.NoError(t, err)
+	require.Len(t, data.Functions, 1)
+	assert.Equal(t, testSourceFile, data.Functions[0].SourceFile)
+}
+
+func TestFunctionComplexityMetric_Compute_SourceFile(t *testing.T) {
+	t.Parallel()
+
+	functions := []FunctionData{
+		{Name: testFunctionName, SourceFile: testSourceFile, CyclomaticComplexity: 5, LinesOfCode: testLinesOfCode},
+	}
+	metric := NewFunctionComplexityMetric()
+	input := makeTestReportData(functions)
+
+	result := metric.Compute(input)
+
+	require.Len(t, result, 1)
+	assert.Equal(t, testSourceFile, result[0].SourceFile)
+}
+
 // Helper to create test ReportData with functions.
 func makeTestReportData(functions []FunctionData) *ReportData {
 	return &ReportData{
diff --git a/internal/analyzers/complexity/plot.go b/internal/analyzers/complexity/plot.go
index 2c1fd84..fb1e3c7 100644
--- a/internal/analyzers/complexity/plot.go
+++ b/internal/analyzers/complexity/plot.go
@@ -3,6 +3,7 @@ package complexity
 import (
 	"errors"
 	"io"
+	"path/filepath"
 
 	"github.com/go-echarts/go-echarts/v2/charts"
 	"github.com/go-echarts/go-echarts/v2/opts"
@@ -157,11 +158,7 @@ func extractComplexityData(functions []map[string]any) (labels []string, cycloma
 	colors = make([]string, len(functions))
 
 	for i, fn := range functions {
-		if name, ok := fn["name"].(string); ok {
-			labels[i] = name
-		} else {
-			labels[i] = unknownName
-		}
+		labels[i] = formatPlotLabel(fn)
 
 		cyclomatic[i] = getCyclomaticValue(fn)
 		cognitive[i] = getCognitiveValue(fn)
@@ -171,6 +168,22 @@ func extractComplexityData(functions []map[string]any) (labels []string, cycloma
 	return labels, cyclomatic, cognitive, colors
 }
 
+// formatPlotLabel builds a chart label from function name and source file.
+// Shows "filename:func" when source_file is available, otherwise just the name.
+func formatPlotLabel(fn map[string]any) string {
+	name := reportutil.MapString(fn, "name")
+	if name == "" {
+		name = unknownName
+	}
+
+	sf := reportutil.MapString(fn, analyze.SourceFileKey)
+	if sf == "" {
+		return name
+	}
+
+	return filepath.Base(sf) + ":" + name
+}
+
 func getComplexityColor(complexity int) string {
 	switch {
 	case complexity <= cyclomaticYellowLine:
@@ -277,16 +290,12 @@ func createComplexityScatterChart(functions []map[string]any, co *plotpage.Chart
 		cyclomatic := getCyclomaticValue(fn)
 		cognitive := getCognitiveValue(fn)
 		nesting := getNestingValue(fn)
-		name := unknownName
-
-		if n, ok := fn["name"].(string); ok {
-			name = n
-		}
+		label := formatPlotLabel(fn)
 
 		symbolSize := scatterSymbolSize + nesting*nestingMultiplier
 
 		scatterData[i] = opts.ScatterData{
-			Value:      []any{cyclomatic, cognitive, name},
+			Value:      []any{cyclomatic, cognitive, label},
 			SymbolSize: symbolSize,
 		}
 	}
diff --git a/internal/analyzers/complexity/report_section.go b/internal/analyzers/complexity/report_section.go
index 75ac470..d1ad4b8 100644
--- a/internal/analyzers/complexity/report_section.go
+++ b/internal/analyzers/complexity/report_section.go
@@ -170,12 +170,14 @@ func (s *ReportSection) complexityIssues(limit int) []analyze.Issue {
 		cc := reportutil.GetInt(fn, KeyFuncCyclomatic)
 		cognitive := reportutil.GetInt(fn, KeyFuncCognitive)
 		nesting := reportutil.GetInt(fn, KeyFuncNesting)
+		location := reportutil.MapString(fn, analyze.SourceFileKey)
 		envelopes = append(envelopes, issueEnvelope{
 			cyclomatic: cc,
 			cognitive:  cognitive,
 			nesting:    nesting,
 			issue: analyze.Issue{
 				Name:     name,
+				Location: location,
 				Value:    fmt.Sprintf("%s%d | Cog=%d | Nest=%d", IssueValuePrefix, cc, cognitive, nesting),
 				Severity: severityForComplexity(cc),
 			},
diff --git a/internal/analyzers/composition/aggregator.go b/internal/analyzers/composition/aggregator.go
new file mode 100644
index 0000000..a6648cb
--- /dev/null
+++ b/internal/analyzers/composition/aggregator.go
@@ -0,0 +1,65 @@
+package composition
+
+import (
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common"
+	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
+)
+
+// Aggregator report keys.
+const (
+	keyBreakdown  = "breakdown"
+	keyPercentage = "percentages"
+	keyTotalFiles = "total_files"
+
+	percentMultiplier = 100.0
+)
+
+// Aggregator aggregates file composition results across multiple files.
+type Aggregator struct {
+	common.PerFileRetainer
+
+	counts     filehistory.CategoryCounts
+	totalFiles int
+}
+
+// NewAggregator creates a new composition Aggregator.
+func NewAggregator() *Aggregator {
+	return &Aggregator{}
+}
+
+// Aggregate accumulates per-file classification results.
+func (a *Aggregator) Aggregate(results map[string]analyze.Report) {
+	for _, report := range results {
+		a.Retain(report)
+		a.totalFiles++
+
+		cat, ok := report[keyCategory].(string)
+		if !ok {
+			continue
+		}
+
+		a.counts.Increment(filehistory.Category(cat))
+	}
+}
+
+// GetResult builds the aggregated composition report.
+func (a *Aggregator) GetResult() analyze.Report {
+	breakdown := make(map[string]int, len(filehistory.AllCategories))
+	percentages := make(map[string]float64, len(filehistory.AllCategories))
+
+	for _, cat := range filehistory.AllCategories {
+		count := a.counts.Get(cat)
+		breakdown[string(cat)] = count
+
+		if a.totalFiles > 0 {
+			percentages[string(cat)] = float64(count) / float64(a.totalFiles) * percentMultiplier
+		}
+	}
+
+	return analyze.Report{
+		keyBreakdown:  breakdown,
+		keyPercentage: percentages,
+		keyTotalFiles: a.totalFiles,
+	}
+}
diff --git a/internal/analyzers/composition/analyzer.go b/internal/analyzers/composition/analyzer.go
new file mode 100644
index 0000000..71ccf71
--- /dev/null
+++ b/internal/analyzers/composition/analyzer.go
@@ -0,0 +1,127 @@
+// Package composition provides a static file composition analyzer that classifies
+// files by type (source, vendor, generated, docs, config, binary, image) using enry.
+package composition
+
+import (
+	"encoding/json"
+	"fmt"
+	"io"
+
+	"gopkg.in/yaml.v3"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/reportutil"
+	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
+	"github.com/Sumatoshi-tech/codefang/pkg/pipeline"
+)
+
+// Analyzer constants.
+const (
+	analyzerName        = "composition"
+	analyzerFlag        = "composition"
+	analyzerID          = "static/composition"
+	analyzerDescription = "Classifies files by type (source, vendor, generated, docs, config, binary, image) using enry."
+
+	// keyCategory is the report key for the file category.
+	keyCategory = "category"
+)
+
+// Analyzer implements analyze.RawFileAnalyzer for file composition analysis.
+// It classifies files by type using enry-based detection on raw file content.
+type Analyzer struct {
+	classifier *filehistory.Classifier
+}
+
+// NewAnalyzer creates a new composition Analyzer.
+func NewAnalyzer() *Analyzer {
+	return &Analyzer{
+		classifier: filehistory.NewClassifier(),
+	}
+}
+
+// Name returns the analyzer name.
+func (a *Analyzer) Name() string { return analyzerName }
+
+// Flag returns the CLI flag name.
+func (a *Analyzer) Flag() string { return analyzerFlag }
+
+// Descriptor returns the analyzer descriptor.
+func (a *Analyzer) Descriptor() analyze.Descriptor {
+	return analyze.NewDescriptor(analyze.ModeStatic, analyzerName, analyzerDescription)
+}
+
+// ListConfigurationOptions returns available configuration options.
+func (a *Analyzer) ListConfigurationOptions() []pipeline.ConfigurationOption {
+	return nil
+}
+
+// Configure applies configuration facts.
+func (a *Analyzer) Configure(_ map[string]any) error {
+	return nil
+}
+
+// Thresholds returns metric thresholds. Composition is informational, so it defines none.
+func (a *Analyzer) Thresholds() analyze.Thresholds {
+	return nil
+}
+
+// CreateAggregator returns a new composition aggregator.
+func (a *Analyzer) CreateAggregator() analyze.ResultAggregator {
+	return NewAggregator()
+}
+
+// AnalyzeFileContent classifies a file by its path and content using enry.
+func (a *Analyzer) AnalyzeFileContent(path string, content []byte) (analyze.Report, error) {
+	category := a.classifier.Classify(path, content)
+
+	return analyze.Report{
+		keyCategory: string(category),
+	}, nil
+}
+
+// CreateReportSection creates a ReportSection from aggregated composition data.
+func (a *Analyzer) CreateReportSection(report analyze.Report) analyze.ReportSection {
+	return NewReportSection(report)
+}
+
+// FormatReport writes the report as indented JSON, reusing the JSON encoder for text output.
+func (a *Analyzer) FormatReport(report analyze.Report, writer io.Writer) error {
+	return encodeJSON(report, writer)
+}
+
+// FormatReportJSON writes JSON output.
+func (a *Analyzer) FormatReportJSON(report analyze.Report, writer io.Writer) error {
+	return encodeJSON(report, writer)
+}
+
+// FormatReportYAML writes YAML output.
+func (a *Analyzer) FormatReportYAML(report analyze.Report, writer io.Writer) error {
+	yamlErr := yaml.NewEncoder(writer).Encode(report)
+	if yamlErr != nil {
+		return fmt.Errorf("encode yaml: %w", yamlErr)
+	}
+
+	return nil
+}
+
+func encodeJSON(report analyze.Report, writer io.Writer) error {
+	encoder := json.NewEncoder(writer)
+	encoder.SetIndent("", "  ")
+
+	encodeErr := encoder.Encode(report)
+	if encodeErr != nil {
+		return fmt.Errorf("encode json: %w", encodeErr)
+	}
+
+	return nil
+}
+
+// FormatReportPlot writes plot output (same as JSON for composition).
+func (a *Analyzer) FormatReportPlot(report analyze.Report, writer io.Writer) error {
+	return a.FormatReportJSON(report, writer)
+}
+
+// FormatReportBinary writes binary envelope output.
+func (a *Analyzer) FormatReportBinary(report analyze.Report, writer io.Writer) error {
+	return reportutil.EncodeBinaryEnvelope(report, writer)
+}
diff --git a/internal/analyzers/composition/analyzer_test.go b/internal/analyzers/composition/analyzer_test.go
new file mode 100644
index 0000000..0239356
--- /dev/null
+++ b/internal/analyzers/composition/analyzer_test.go
@@ -0,0 +1,298 @@
+package composition
+
+import (
+	"bytes"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
+)
+
+func TestAnalyzer_Name(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	assert.Equal(t, analyzerName, a.Name())
+}
+
+func TestAnalyzer_Flag(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	assert.Equal(t, analyzerFlag, a.Flag())
+}
+
+func TestAnalyzer_Descriptor(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	d := a.Descriptor()
+	assert.Equal(t, analyze.ModeStatic, d.Mode)
+	assert.Equal(t, analyzerID, d.ID)
+}
+
+func TestAnalyzer_Thresholds_Nil(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	assert.Nil(t, a.Thresholds())
+}
+
+func TestAnalyzer_AnalyzeContent_GoFile(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent("pkg/main.go", []byte("package main\n\nfunc main() {}\n"))
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategorySource), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_VendorPath(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent("vendor/github.com/foo/bar.go", []byte("package bar\n"))
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryVendor), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_Markdown(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent("docs/README.md", []byte("# Hello\n"))
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryDocumentation), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_ConfigFile(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent(".golangci.yml", []byte("linters:\n  enable:\n"))
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryConfiguration), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_BinaryContent(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	// Binary content: null bytes trigger enry.IsBinary.
+	binary := []byte{0x00, 0x01, 0x02, 0xFF, 0xFE, 0x00, 0x00, 0x00}
+	report, err := a.AnalyzeFileContent("data.bin", binary)
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryBinary), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_DotFile(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent(".editorconfig", []byte("[*]\nindent_style = tab\n"))
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryDotFile), report[keyCategory])
+}
+
+func TestAnalyzer_AnalyzeContent_ImagePath(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	report, err := a.AnalyzeFileContent("logo.png", nil)
+	require.NoError(t, err)
+	assert.Equal(t, string(filehistory.CategoryImage), report[keyCategory])
+}
+
+func TestAnalyzer_CreateAggregator(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	agg := a.CreateAggregator()
+	require.NotNil(t, agg)
+}
+
+func TestAnalyzer_CreateReportSection(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	section := a.CreateReportSection(analyze.Report{})
+	require.NotNil(t, section)
+	assert.Equal(t, sectionTitle, section.SectionTitle())
+}
+
+func TestAnalyzer_ImplementsRawFileAnalyzer(t *testing.T) {
+	t.Parallel()
+
+	var _ analyze.RawFileAnalyzer = (*Analyzer)(nil)
+}
+
+func TestAnalyzer_FormatReportJSON(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	var buf bytes.Buffer
+
+	err := a.FormatReportJSON(analyze.Report{keyCategory: "source"}, &buf)
+	require.NoError(t, err)
+	assert.Contains(t, buf.String(), "source")
+}
+
+func TestAnalyzer_FormatReportYAML(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	var buf bytes.Buffer
+
+	err := a.FormatReportYAML(analyze.Report{keyCategory: "vendor"}, &buf)
+	require.NoError(t, err)
+	assert.Contains(t, buf.String(), "vendor")
+}
+
+func TestAnalyzer_FormatReport(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	var buf bytes.Buffer
+
+	err := a.FormatReport(analyze.Report{keyCategory: "binary"}, &buf)
+	require.NoError(t, err)
+	assert.Contains(t, buf.String(), "binary")
+}
+
+func TestAnalyzer_FormatReportPlot(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	var buf bytes.Buffer
+
+	err := a.FormatReportPlot(analyze.Report{keyCategory: "docs"}, &buf)
+	require.NoError(t, err)
+	assert.Contains(t, buf.String(), "docs")
+}
+
+func TestAnalyzer_FormatReportBinary(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+
+	var buf bytes.Buffer
+
+	err := a.FormatReportBinary(analyze.Report{keyCategory: "source"}, &buf)
+	require.NoError(t, err)
+	assert.NotEmpty(t, buf.Bytes())
+}
+
+func TestAnalyzer_Configure_NoError(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	assert.NoError(t, a.Configure(nil))
+}
+
+func TestAnalyzer_ListConfigurationOptions_Empty(t *testing.T) {
+	t.Parallel()
+
+	a := NewAnalyzer()
+	assert.Nil(t, a.ListConfigurationOptions())
+}
+
+// Aggregator tests.
+
+func TestAggregator_EmptyResult(t *testing.T) {
+	t.Parallel()
+
+	agg := NewAggregator()
+	result := agg.GetResult()
+
+	total, ok := result[keyTotalFiles].(int)
+	require.True(t, ok)
+	assert.Equal(t, 0, total)
+}
+
+func TestAggregator_SingleFile(t *testing.T) {
+	t.Parallel()
+
+	agg := NewAggregator()
+	agg.Aggregate(map[string]analyze.Report{
+		analyzerName: {keyCategory: string(filehistory.CategorySource)},
+	})
+
+	result := agg.GetResult()
+
+	total, ok := result[keyTotalFiles].(int)
+	require.True(t, ok)
+	assert.Equal(t, 1, total)
+
+	breakdown, ok := result[keyBreakdown].(map[string]int)
+	require.True(t, ok)
+	assert.Equal(t, 1, breakdown[string(filehistory.CategorySource)])
+}
+
+func TestAggregator_MultipleFiles(t *testing.T) {
+	t.Parallel()
+
+	agg := NewAggregator()
+
+	// 3 source + 1 vendor + 1 docs = 5 total.
+	files := []filehistory.Category{
+		filehistory.CategorySource,
+		filehistory.CategorySource,
+		filehistory.CategorySource,
+		filehistory.CategoryVendor,
+		filehistory.CategoryDocumentation,
+	}
+
+	for _, cat := range files {
+		agg.Aggregate(map[string]analyze.Report{
+			analyzerName: {keyCategory: string(cat)},
+		})
+	}
+
+	result := agg.GetResult()
+
+	total, ok := result[keyTotalFiles].(int)
+	require.True(t, ok)
+	assert.Equal(t, len(files), total)
+
+	breakdown, ok := result[keyBreakdown].(map[string]int)
+	require.True(t, ok)
+	assert.Equal(t, 3, breakdown[string(filehistory.CategorySource)])
+	assert.Equal(t, 1, breakdown[string(filehistory.CategoryVendor)])
+	assert.Equal(t, 1, breakdown[string(filehistory.CategoryDocumentation)])
+
+	percentages, ok := result[keyPercentage].(map[string]float64)
+	require.True(t, ok)
+	assert.InDelta(t, 60.0, percentages[string(filehistory.CategorySource)], 0.1)
+	assert.InDelta(t, 20.0, percentages[string(filehistory.CategoryVendor)], 0.1)
+	assert.InDelta(t, 20.0, percentages[string(filehistory.CategoryDocumentation)], 0.1)
+}
+
+func TestAggregator_SkipsInvalidCategory(t *testing.T) {
+	t.Parallel()
+
+	agg := NewAggregator()
+	agg.Aggregate(map[string]analyze.Report{
+		analyzerName: {"not_a_category": 42},
+	})
+
+	result := agg.GetResult()
+
+	total, ok := result[keyTotalFiles].(int)
+	require.True(t, ok)
+	// File counted but no category incremented.
+	assert.Equal(t, 1, total)
+}
diff --git a/internal/analyzers/composition/report_section.go b/internal/analyzers/composition/report_section.go
new file mode 100644
index 0000000..7e9b3d8
--- /dev/null
+++ b/internal/analyzers/composition/report_section.go
@@ -0,0 +1,164 @@
+package composition
+
+import (
+	"fmt"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/reportutil"
+	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
+)
+
+// Report section display constants.
+const (
+	sectionTitle     = "COMPOSITION"
+	metricTotalFiles = "Total Files"
+	metricSource     = "Source Files"
+	metricSourcePct  = "Source %"
+
+	statusDefault = "File composition analysis completed"
+	statusEmpty   = "No files analyzed"
+)
+
+// ReportSection implements analyze.ReportSection for composition analysis.
+type ReportSection struct {
+	analyze.BaseReportSection
+
+	report analyze.Report
+}
+
+// NewReportSection creates a ReportSection from aggregated composition data.
+func NewReportSection(report analyze.Report) *ReportSection {
+	msg := statusDefault
+
+	total := reportutil.GetInt(report, keyTotalFiles)
+	if total == 0 {
+		msg = statusEmpty
+	}
+
+	return &ReportSection{
+		BaseReportSection: analyze.BaseReportSection{
+			Title:      sectionTitle,
+			Message:    msg,
+			ScoreValue: analyze.ScoreInfoOnly,
+		},
+		report: report,
+	}
+}
+
+// KeyMetrics returns ordered key metrics for display.
+func (s *ReportSection) KeyMetrics() []analyze.Metric {
+	total := reportutil.GetInt(s.report, keyTotalFiles)
+	breakdown := getBreakdown(s.report)
+	sourceCount := breakdown[string(filehistory.CategorySource)]
+
+	return []analyze.Metric{
+		{Label: metricTotalFiles, Value: reportutil.FormatInt(total)},
+		{Label: metricSource, Value: reportutil.FormatInt(sourceCount)},
+		{Label: metricSourcePct, Value: reportutil.FormatPercent(reportutil.Pct(sourceCount, total))},
+	}
+}
+
+// Distribution returns category breakdown as distribution items.
+func (s *ReportSection) Distribution() []analyze.DistributionItem {
+	breakdown := getBreakdown(s.report)
+	total := reportutil.GetInt(s.report, keyTotalFiles)
+
+	if total == 0 {
+		return nil
+	}
+
+	items := make([]analyze.DistributionItem, 0, len(filehistory.AllCategories))
+
+	for _, cat := range filehistory.AllCategories {
+		count := breakdown[string(cat)]
+		if count == 0 {
+			continue
+		}
+
+		items = append(items, analyze.DistributionItem{
+			Label:   string(cat),
+			Percent: reportutil.Pct(count, total),
+			Count:   count,
+		})
+	}
+
+	return items
+}
+
+// TopIssues returns up to n non-source categories as issues (all of them when n <= 0).
+func (s *ReportSection) TopIssues(n int) []analyze.Issue {
+	return s.buildIssues(n)
+}
+
+// AllIssues returns every non-source category with a nonzero file count as an issue.
+func (s *ReportSection) AllIssues() []analyze.Issue {
+	return s.buildIssues(0)
+}
+
+// buildIssues creates issues for non-source categories showing file counts.
+func (s *ReportSection) buildIssues(limit int) []analyze.Issue {
+	breakdown := getBreakdown(s.report)
+	total := reportutil.GetInt(s.report, keyTotalFiles)
+
+	if total == 0 {
+		return nil
+	}
+
+	issues := make([]analyze.Issue, 0, len(filehistory.AllCategories))
+
+	for _, cat := range filehistory.AllCategories {
+		if cat == filehistory.CategorySource {
+			continue
+		}
+
+		count := breakdown[string(cat)]
+		if count == 0 {
+			continue
+		}
+
+		issues = append(issues, analyze.Issue{
+			Name:     string(cat),
+			Value:    fmt.Sprintf("%d files (%.1f%%)", count, float64(count)/float64(total)*percentMultiplier),
+			Severity: severityForCategory(cat),
+		})
+	}
+
+	if limit > 0 && len(issues) > limit {
+		issues = issues[:limit]
+	}
+
+	return issues
+}
+
+// severityForCategory returns the appropriate severity for a file category.
+func severityForCategory(cat filehistory.Category) string {
+	switch cat {
+	case filehistory.CategoryBinary:
+		return analyze.SeverityPoor
+	case filehistory.CategorySource,
+		filehistory.CategoryVendor,
+		filehistory.CategoryGenerated,
+		filehistory.CategoryDocumentation,
+		filehistory.CategoryConfiguration,
+		filehistory.CategoryImage,
+		filehistory.CategoryDotFile:
+		return analyze.SeverityInfo
+	}
+
+	return analyze.SeverityInfo
+}
+
+// getBreakdown extracts the breakdown map from a report; it returns nil if the key is missing or the value is not a map[string]int.
+func getBreakdown(report analyze.Report) map[string]int {
+	raw, ok := report[keyBreakdown]
+	if !ok {
+		return nil
+	}
+
+	m, isMap := raw.(map[string]int)
+	if isMap {
+		return m
+	}
+
+	return nil
+}
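Note on `getBreakdown`: the assertion `raw.(map[string]int)` succeeds only when the stored value has exactly that dynamic type. If a report were ever rebuilt from JSON (an assumption, nothing in this diff shows that path), `encoding/json` would produce `map[string]any` with `float64` values and the assertion would fail, yielding a nil breakdown. A minimal sketch of a more tolerant lookup follows; `report` and `breakdownFrom` are hypothetical stand-ins, not part of the codebase.

```go
package main

import "fmt"

// report is a stand-in for analyze.Report, assumed here to be a map[string]any.
type report map[string]any

// breakdownFrom is a hypothetical, more tolerant variant of getBreakdown.
// It accepts the in-memory shape (map[string]int) and the shape produced by
// decoding JSON into any (map[string]any holding float64 numbers).
func breakdownFrom(r report, key string) map[string]int {
	switch m := r[key].(type) {
	case map[string]int:
		return m
	case map[string]any:
		out := make(map[string]int, len(m))
		for k, v := range m {
			switch n := v.(type) {
			case int:
				out[k] = n
			case float64: // encoding/json decodes JSON numbers into float64
				out[k] = int(n)
			}
		}
		return out
	}

	// Missing key or unsupported type.
	return nil
}

func main() {
	inMemory := report{"breakdown": map[string]int{"source": 6}}
	decoded := report{"breakdown": map[string]any{"source": float64(6)}}

	fmt.Println(breakdownFrom(inMemory, "breakdown")["source"])
	fmt.Println(breakdownFrom(decoded, "breakdown")["source"])
	fmt.Println(breakdownFrom(report{}, "breakdown") == nil)
}
```

Whether this tolerance is needed depends on how reports reach `NewReportSection`; if they are always built in-process, the strict assertion in the diff is sufficient.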
diff --git a/internal/analyzers/composition/report_section_test.go b/internal/analyzers/composition/report_section_test.go
new file mode 100644
index 0000000..5a5e3b2
--- /dev/null
+++ b/internal/analyzers/composition/report_section_test.go
@@ -0,0 +1,193 @@
+package composition
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
+)
+
+func newTestCompositionReport() analyze.Report {
+	return analyze.Report{
+		keyTotalFiles: 10,
+		keyBreakdown: map[string]int{
+			string(filehistory.CategorySource):        6,
+			string(filehistory.CategoryVendor):        2,
+			string(filehistory.CategoryDocumentation): 1,
+			string(filehistory.CategoryBinary):        1,
+		},
+		keyPercentage: map[string]float64{
+			string(filehistory.CategorySource):        60.0,
+			string(filehistory.CategoryVendor):        20.0,
+			string(filehistory.CategoryDocumentation): 10.0,
+			string(filehistory.CategoryBinary):        10.0,
+		},
+	}
+}
+
+func TestCompositionSection_Title(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	assert.Equal(t, sectionTitle, s.SectionTitle())
+}
+
+func TestCompositionSection_Score_InfoOnly(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	assert.InDelta(t, analyze.ScoreInfoOnly, s.Score(), 0.001)
+}
+
+func TestCompositionSection_StatusMessage(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	assert.Equal(t, statusDefault, s.StatusMessage())
+}
+
+func TestCompositionSection_StatusMessage_Empty(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(analyze.Report{})
+	assert.Equal(t, statusEmpty, s.StatusMessage())
+}
+
+func TestCompositionSection_NilReport(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(nil)
+	assert.Equal(t, sectionTitle, s.SectionTitle())
+	assert.Equal(t, statusEmpty, s.StatusMessage())
+}
+
+func TestCompositionSection_KeyMetrics_Count(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	metrics := s.KeyMetrics()
+
+	const expectedMetrics = 3
+	require.Len(t, metrics, expectedMetrics)
+}
+
+func TestCompositionSection_KeyMetrics_Labels(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	metrics := s.KeyMetrics()
+
+	assert.Equal(t, metricTotalFiles, metrics[0].Label)
+	assert.Equal(t, metricSource, metrics[1].Label)
+	assert.Equal(t, metricSourcePct, metrics[2].Label)
+}
+
+func TestCompositionSection_KeyMetrics_Values(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	metrics := s.KeyMetrics()
+
+	assert.Equal(t, "10", metrics[0].Value)
+	assert.Equal(t, "6", metrics[1].Value)
+	assert.Contains(t, metrics[2].Value, "60")
+}
+
+func TestCompositionSection_Distribution(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+	dist := s.Distribution()
+
+	require.NotNil(t, dist)
+	// 4 categories with non-zero counts.
+	require.Len(t, dist, 4)
+
+	// First should be source (order follows AllCategories).
+	assert.Equal(t, string(filehistory.CategorySource), dist[0].Label)
+	assert.Equal(t, 6, dist[0].Count)
+}
+
+func TestCompositionSection_Distribution_Empty(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(analyze.Report{})
+	assert.Nil(t, s.Distribution())
+}
+
+func TestCompositionSection_TopIssues(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+
+	issues := s.TopIssues(2)
+	require.Len(t, issues, 2)
+}
+
+func TestCompositionSection_AllIssues(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+
+	issues := s.AllIssues()
+	// 3 non-source categories with counts: vendor, docs, binary.
+	require.Len(t, issues, 3)
+}
+
+func TestCompositionSection_Issues_BinarySeverityPoor(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+
+	issues := s.AllIssues()
+
+	var binaryIssue *analyze.Issue
+
+	for idx := range issues {
+		if issues[idx].Name == string(filehistory.CategoryBinary) {
+			binaryIssue = &issues[idx]
+
+			break
+		}
+	}
+
+	require.NotNil(t, binaryIssue, "binary category must appear in issues")
+	assert.Equal(t, analyze.SeverityPoor, binaryIssue.Severity)
+}
+
+func TestCompositionSection_Issues_VendorSeverityInfo(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(newTestCompositionReport())
+
+	issues := s.AllIssues()
+
+	var vendorIssue *analyze.Issue
+
+	for idx := range issues {
+		if issues[idx].Name == string(filehistory.CategoryVendor) {
+			vendorIssue = &issues[idx]
+
+			break
+		}
+	}
+
+	require.NotNil(t, vendorIssue, "vendor category must appear in issues")
+	assert.Equal(t, analyze.SeverityInfo, vendorIssue.Severity)
+}
+
+func TestCompositionSection_Issues_Empty(t *testing.T) {
+	t.Parallel()
+
+	s := NewReportSection(analyze.Report{})
+	assert.Nil(t, s.AllIssues())
+}
+
+func TestCompositionSection_ImplementsInterface(t *testing.T) {
+	t.Parallel()
+
+	var _ analyze.ReportSection = (*ReportSection)(nil)
+}
diff --git a/internal/analyzers/couples/metrics.go b/internal/analyzers/couples/metrics.go
index cd0b419..8fb347c 100644
--- a/internal/analyzers/couples/metrics.go
+++ b/internal/analyzers/couples/metrics.go
@@ -6,6 +6,7 @@ import (
 	"sort"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/identity"
 	"github.com/Sumatoshi-tech/codefang/pkg/alg/hll"
 	"github.com/Sumatoshi-tech/codefang/pkg/metrics"
 )
@@ -74,10 +75,12 @@ type FileCouplingData struct {
 
 // DeveloperCouplingData contains coupling data for a developer pair.
 type DeveloperCouplingData struct {
-	Developer1  string  `json:"developer1"          yaml:"developer1"`
-	Developer2  string  `json:"developer2"          yaml:"developer2"`
-	SharedFiles int64   `json:"shared_file_changes" yaml:"shared_file_changes"`
-	Strength    float64 `json:"coupling_strength"   yaml:"coupling_strength"`
+	Developer1      string  `json:"developer1"                 yaml:"developer1"`
+	Developer1Email string  `json:"developer1_email,omitempty" yaml:"developer1_email,omitempty"`
+	Developer2      string  `json:"developer2"                 yaml:"developer2"`
+	Developer2Email string  `json:"developer2_email,omitempty" yaml:"developer2_email,omitempty"`
+	SharedFiles     int64   `json:"shared_file_changes"        yaml:"shared_file_changes"`
+	Strength        float64 `json:"coupling_strength"          yaml:"coupling_strength"`
 }
 
 // FileOwnershipData contains ownership information for a file.
@@ -207,7 +210,7 @@ func (m *DeveloperCouplingMetric) Compute(input *ReportData) []DeveloperCoupling
 }
 
 func computeDevCouplings(devIdx int, row map[int]int64, matrix []map[int]int64, names []string) []DeveloperCouplingData {
-	dev1 := getDevName(devIdx, names)
+	dev1Name, dev1Email := getDevNameAndEmail(devIdx, names)
 
 	var result []DeveloperCouplingData
 
@@ -221,16 +224,21 @@ func computeDevCouplings(devIdx int, row map[int]int64, matrix []map[int]int64,
 			selfDev2 = matrix[j][j]
 		}
 
-		coupling := buildCouplingData(dev1, j, sharedChanges, row[devIdx], selfDev2, names)
+		dev2Name, dev2Email := getDevNameAndEmail(j, names)
+		coupling := buildCouplingData(
+			dev1Name, dev1Email, dev2Name, dev2Email,
+			sharedChanges, row[devIdx], selfDev2,
+		)
 		result = append(result, coupling)
 	}
 
 	return result
 }
 
-func buildCouplingData(dev1 string, dev2Idx int, sharedChanges, selfDev1, selfDev2 int64, names []string) DeveloperCouplingData {
-	dev2 := getDevName(dev2Idx, names)
-
+func buildCouplingData(
+	dev1Name, dev1Email, dev2Name, dev2Email string,
+	sharedChanges, selfDev1, selfDev2 int64,
+) DeveloperCouplingData {
 	// Coupling strength using code-maat formula:
 	// degree = shared_changes / average(self_dev1, self_dev2), capped at 1.0.
 	avgRevs := float64(selfDev1+selfDev2) / pairCount
@@ -241,19 +249,21 @@ func buildCouplingData(dev1 string, dev2Idx int, sharedChanges, selfDev1, selfDe
 	}
 
 	return DeveloperCouplingData{
-		Developer1:  dev1,
-		Developer2:  dev2,
-		SharedFiles: sharedChanges,
-		Strength:    strength,
+		Developer1:      dev1Name,
+		Developer1Email: dev1Email,
+		Developer2:      dev2Name,
+		Developer2Email: dev2Email,
+		SharedFiles:     sharedChanges,
+		Strength:        strength,
 	}
 }
 
-func getDevName(idx int, names []string) string {
+func getDevNameAndEmail(idx int, names []string) (name, email string) {
 	if idx < len(names) {
-		return names[idx]
+		return identity.SplitIdentity(names[idx])
 	}
 
-	return ""
+	return "", ""
 }
 
 // FileOwnershipMetric computes file ownership information.
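Note on `buildCouplingData`: the comment describes the code-maat degree as shared changes divided by the average of both developers' own change counts, capped at 1.0. A small worked sketch of that arithmetic is below. `couplingStrength` is a hypothetical name, and the zero-average guard is an assumption, since the hunk elides the lines between the average and the cap.

```go
package main

import "fmt"

// couplingStrength sketches the degree described in the comment:
// shared / average(self1, self2), capped at 1.0.
// Returning 0 for a zero average is an assumption made for this sketch.
func couplingStrength(shared, self1, self2 int64) float64 {
	avg := float64(self1+self2) / 2
	if avg == 0 {
		return 0
	}

	strength := float64(shared) / avg
	if strength > 1.0 {
		strength = 1.0
	}

	return strength
}

func main() {
	// 5 shared changes against an average of 10 own changes.
	fmt.Println(couplingStrength(5, 10, 10))
	// More shared changes than the average: the cap applies.
	fmt.Println(couplingStrength(30, 10, 10))
	// No own changes recorded for either developer.
	fmt.Println(couplingStrength(1, 0, 0))
}
```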
diff --git a/internal/analyzers/couples/store_writer_test.go b/internal/analyzers/couples/store_writer_test.go
index 6f006bf..48dae53 100644
--- a/internal/analyzers/couples/store_writer_test.go
+++ b/internal/analyzers/couples/store_writer_test.go
@@ -1,7 +1,5 @@
 package couples
 
-// FRD: specs/frds/FRD-20260228-couples-store-writer.md.
-
 import (
 	"context"
 	"sort"
diff --git a/internal/analyzers/devs/analyzer.go b/internal/analyzers/devs/analyzer.go
index 9fff651..15221d1 100644
--- a/internal/analyzers/devs/analyzer.go
+++ b/internal/analyzers/devs/analyzer.go
@@ -38,7 +38,9 @@ type DevTick struct {
 // It groups all per-commit developer data within one time bucket.
 type TickDevData struct {
 	// DevData maps commit hash hex to per-commit developer statistics.
-	DevData map[string]*CommitDevData
+	DevData   map[string]*CommitDevData
+	startTime time.Time
+	endTime   time.Time
 }
 
 // Configuration option keys for the devs analyzer.
@@ -350,10 +352,24 @@ func extractTC(tc analyze.TC, byTick map[int]*TickDevData) error {
 
 	state, ok := byTick[tc.Tick]
 	if !ok || state == nil {
-		state = &TickDevData{DevData: make(map[string]*CommitDevData)}
+		state = &TickDevData{
+			DevData:   make(map[string]*CommitDevData),
+			startTime: tc.Timestamp,
+			endTime:   tc.Timestamp,
+		}
 		byTick[tc.Tick] = state
 	}
 
+	if !tc.Timestamp.IsZero() {
+		if tc.Timestamp.Before(state.startTime) || state.startTime.IsZero() {
+			state.startTime = tc.Timestamp
+		}
+
+		if tc.Timestamp.After(state.endTime) {
+			state.endTime = tc.Timestamp
+		}
+	}
+
 	state.DevData[tc.CommitHash.String()] = cdd
 
 	return nil
@@ -404,8 +420,10 @@ func buildTick(tick int, state *TickDevData) (analyze.TICK, error) {
 	}
 
 	return analyze.TICK{
-		Tick: tick,
-		Data: state,
+		Tick:      tick,
+		StartTime: state.startTime,
+		EndTime:   state.endTime,
+		Data:      state,
 	}, nil
 }
 
@@ -500,5 +518,6 @@ func ticksToReport(
 		"CommitsByTick":      commitsByTick,
 		"ReversedPeopleDict": names,
 		"TickSize":           tickSize,
+		"tick_bounds":        analyze.BuildTickBounds(ticks),
 	}
 }
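Note on the tick bounds in `extractTC`: each tick keeps the earliest and latest commit timestamp it has seen, and zero timestamps are skipped so they cannot pull `startTime` back to the zero value. The same rule can be sketched in isolation as follows; `bounds`, `observe`, `observeAll`, and `day` are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// bounds mirrors the startTime/endTime pair kept per tick.
type bounds struct {
	start, end time.Time
}

// observe widens the bounds to include ts, ignoring zero timestamps.
func (b *bounds) observe(ts time.Time) {
	if ts.IsZero() {
		return
	}

	if b.start.IsZero() || ts.Before(b.start) {
		b.start = ts
	}

	if ts.After(b.end) {
		b.end = ts
	}
}

// observeAll folds a sequence of timestamps into one bounds value.
func observeAll(stamps ...time.Time) bounds {
	var b bounds
	for _, ts := range stamps {
		b.observe(ts)
	}

	return b
}

// day returns midnight UTC on the given day of January 2024.
func day(n int) time.Time {
	return time.Date(2024, time.January, n, 0, 0, 0, 0, time.UTC)
}

func main() {
	// Out-of-order commits plus one zero timestamp.
	b := observeAll(day(3), time.Time{}, day(1), day(2))
	fmt.Println(b.start.Format(time.RFC3339), b.end.Format(time.RFC3339))
}
```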
diff --git a/internal/analyzers/devs/dashboard_activity.go b/internal/analyzers/devs/dashboard_activity.go
index 62d4adf..112a788 100644
--- a/internal/analyzers/devs/dashboard_activity.go
+++ b/internal/analyzers/devs/dashboard_activity.go
@@ -69,7 +69,7 @@ func buildTopDevSeries(data *DashboardData, topDevs []int, nameByID map[int]stri
 	for _, devID := range topDevs {
 		seriesData := make([]plotpage.SeriesData, len(data.Metrics.Activity))
 		for i, ad := range data.Metrics.Activity {
-			seriesData[i] = ad.ByDeveloper[devID]
+			seriesData[i] = commitsForDev(ad.ByDeveloper, devID)
 		}
 
 		series = append(series, plotpage.LineSeries{
@@ -89,9 +89,9 @@ func buildOthersSeries(data *DashboardData, topDevs []int) plotpage.LineSeries {
 	for i, ad := range data.Metrics.Activity {
 		total := 0
 
-		for devID, commits := range ad.ByDeveloper {
-			if !slices.Contains(topDevs, devID) {
-				total += commits
+		for _, dc := range ad.ByDeveloper {
+			if !slices.Contains(topDevs, dc.DevID) {
+				total += dc.Commits
 			}
 		}
 
@@ -119,3 +119,14 @@ func getTopDevIDs(developers []DeveloperData, limit int) []int {
 
 	return ids
 }
+
+// commitsForDev returns the commit count for devID in a tick's ByDeveloper entries, or 0 if the developer is absent.
+func commitsForDev(entries []DeveloperCommits, devID int) int {
+	for _, dc := range entries {
+		if dc.DevID == devID {
+			return dc.Commits
+		}
+	}
+
+	return 0
+}
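Note on `commitsForDev`: with `ByDeveloper` changed from `map[int]int` to a slice, callers that used to index the map now scan the slice. The sketch below, using a local copy of the type for self-containment, shows that a missing developer still yields 0, which matches the zero value the old map lookup returned.

```go
package main

import "fmt"

// DeveloperCommits is a local copy of the type added in metrics.go.
type DeveloperCommits struct {
	DevID   int
	Commits int
}

// commitsForDev scans the entries for devID and returns 0 when absent.
func commitsForDev(entries []DeveloperCommits, devID int) int {
	for _, dc := range entries {
		if dc.DevID == devID {
			return dc.Commits
		}
	}

	return 0
}

func main() {
	oldShape := map[int]int{0: 5, 1: 3}
	newShape := []DeveloperCommits{{DevID: 0, Commits: 5}, {DevID: 1, Commits: 3}}

	// Developer 7 is absent in both shapes.
	for _, id := range []int{0, 1, 7} {
		fmt.Println(id, oldShape[id], commitsForDev(newShape, id))
	}
}
```

The scan is linear in the number of developers per tick; if that ever matters, a per-tick lookup map built once (as `devLanguageMap` does for languages) would be an alternative.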
diff --git a/internal/analyzers/devs/dashboard_languages.go b/internal/analyzers/devs/dashboard_languages.go
index 2bbd6c3..819f7a2 100644
--- a/internal/analyzers/devs/dashboard_languages.go
+++ b/internal/analyzers/devs/dashboard_languages.go
@@ -94,9 +94,10 @@ func topDevsByContribution(data *DashboardData, n int) []DeveloperData {
 
 	for i, dev := range data.Metrics.Developers {
 		total := 0
+		langStats := devLanguageMap(dev)
 
 		for _, lang := range data.TopLanguages {
-			if stats, ok := dev.Languages[lang]; ok {
+			if stats, ok := langStats[lang]; ok {
 				total += stats.Added + stats.Removed
 			}
 		}
@@ -121,10 +122,11 @@ func topDevsByContribution(data *DashboardData, n int) []DeveloperData {
 // devContribution returns the total contribution (Added+Removed) for a developer
 // across the given languages.
 func devContribution(dev DeveloperData, langs []string) map[string]int {
+	langStats := devLanguageMap(dev)
 	result := make(map[string]int, len(langs))
 
 	for _, lang := range langs {
-		if stats, ok := dev.Languages[lang]; ok {
+		if stats, ok := langStats[lang]; ok {
 			result[lang] = stats.Added + stats.Removed
 		}
 	}
@@ -132,6 +134,17 @@ func devContribution(dev DeveloperData, langs []string) map[string]int {
 	return result
 }
 
+// devLanguageMap builds a language-name lookup map from a developer's Languages slice.
+func devLanguageMap(dev DeveloperData) map[string]LanguageStatsEntry {
+	m := make(map[string]LanguageStatsEntry, len(dev.Languages))
+
+	for _, entry := range dev.Languages {
+		m[entry.Language] = entry
+	}
+
+	return m
+}
+
 // buildRadarData computes per-developer relative expertise profiles.
 // Each developer is normalized independently: their strongest language = 100%,
 // and all other languages are relative to that. This produces visually distinct
diff --git a/internal/analyzers/devs/dashboard_workload.go b/internal/analyzers/devs/dashboard_workload.go
index 45d0442..f26abd4 100644
--- a/internal/analyzers/devs/dashboard_workload.go
+++ b/internal/analyzers/devs/dashboard_workload.go
@@ -110,10 +110,10 @@ func findPrimaryLanguage(dev DeveloperData) string {
 	primaryLang := langOther
 	maxLines := 0
 
-	for lang, stats := range dev.Languages {
-		if stats.Added > maxLines {
-			maxLines = stats.Added
-			primaryLang = lang
+	for _, entry := range dev.Languages {
+		if entry.Added > maxLines {
+			maxLines = entry.Added
+			primaryLang = entry.Language
 
 			if primaryLang == "" {
 				primaryLang = langOther
diff --git a/internal/analyzers/devs/metrics.go b/internal/analyzers/devs/metrics.go
index 0e48062..6a80183 100644
--- a/internal/analyzers/devs/metrics.go
+++ b/internal/analyzers/devs/metrics.go
@@ -38,10 +38,11 @@ func devIDBytes(id int) []byte {
 
 // TickData is the raw input data for devs metrics computation.
 type TickData struct {
-	Ticks     map[int]map[int]*DevTick
-	Names     []string
-	TickSize  time.Duration
-	DevSketch *hll.Sketch `json:"-" yaml:"-"`
+	Ticks      map[int]map[int]*DevTick
+	Names      []string
+	TickSize   time.Duration
+	TickBounds map[int]analyze.TickBounds
+	DevSketch  *hll.Sketch `json:"-" yaml:"-"`
 }
 
 // AggregateCommitsToTicks builds per-tick per-developer data from per-commit
@@ -127,6 +128,10 @@ func ParseTickDataWithPrecision(report analyze.Report, precision int) (*TickData
 		TickSize: tickSize,
 	}
 
+	if v, ok := report["tick_bounds"].(map[int]analyze.TickBounds); ok {
+		td.TickBounds = v
+	}
+
 	td.DevSketch = buildDevSketchWithPrecision(ticks, precision)
 
 	return td, nil
@@ -307,17 +312,55 @@ func buildCommitsByTickFromMap(cbtMap map[string]any) map[int][]gitlib.Hash {
 
 // DeveloperData contains computed data for a single developer.
 type DeveloperData struct {
-	ID          int                              `json:"id"            yaml:"id"`
-	Name        string                           `json:"name"          yaml:"name"`
-	Commits     int                              `json:"commits"       yaml:"commits"`
-	Added       int                              `json:"lines_added"   yaml:"lines_added"`
-	Removed     int                              `json:"lines_removed" yaml:"lines_removed"`
-	Changed     int                              `json:"lines_changed" yaml:"lines_changed"`
-	NetLines    int                              `json:"net_lines"     yaml:"net_lines"`
-	Languages   map[string]pkgplumbing.LineStats `json:"languages"     yaml:"languages"`
-	FirstTick   int                              `json:"first_tick"    yaml:"first_tick"`
-	LastTick    int                              `json:"last_tick"     yaml:"last_tick"`
-	ActiveTicks int                              `json:"active_ticks"  yaml:"active_ticks"`
+	ID          int                  `json:"id"              yaml:"id"`
+	Name        string               `json:"name"            yaml:"name"`
+	Email       string               `json:"email,omitempty" yaml:"email,omitempty"`
+	Commits     int                  `json:"commits"         yaml:"commits"`
+	Added       int                  `json:"lines_added"     yaml:"lines_added"`
+	Removed     int                  `json:"lines_removed"   yaml:"lines_removed"`
+	Changed     int                  `json:"lines_changed"   yaml:"lines_changed"`
+	NetLines    int                  `json:"net_lines"       yaml:"net_lines"`
+	Languages   []LanguageStatsEntry `json:"languages"       yaml:"languages"`
+	FirstTick   int                  `json:"first_tick"      yaml:"first_tick"`
+	LastTick    int                  `json:"last_tick"       yaml:"last_tick"`
+	ActiveTicks int                  `json:"active_ticks"    yaml:"active_ticks"`
+
+	// langMap is the internal accumulation map, converted to Languages by finalizeLanguages.
+	langMap map[string]pkgplumbing.LineStats `json:"-" yaml:"-"`
+}
+
+// LanguageStatsEntry holds line stats for a single language.
+type LanguageStatsEntry struct {
+	Language string `json:"language" yaml:"language"`
+	Added    int    `json:"added"    yaml:"added"`
+	Removed  int    `json:"removed"  yaml:"removed"`
+	Changed  int    `json:"changed"  yaml:"changed"`
+}
+
+// finalizeLanguages converts the internal langMap to a Languages slice sorted by language name, mapping an empty name to "Other".
+func (d *DeveloperData) finalizeLanguages() {
+	if len(d.langMap) == 0 {
+		return
+	}
+
+	d.Languages = make([]LanguageStatsEntry, 0, len(d.langMap))
+
+	for lang, stats := range d.langMap {
+		if lang == "" {
+			lang = "Other"
+		}
+
+		d.Languages = append(d.Languages, LanguageStatsEntry{
+			Language: lang,
+			Added:    stats.Added,
+			Removed:  stats.Removed,
+			Changed:  stats.Changed,
+		})
+	}
+
+	sort.Slice(d.Languages, func(i, j int) bool {
+		return d.Languages[i].Language < d.Languages[j].Language
+	})
 }
 
 // LanguageData contains computed data for a programming language.
@@ -337,26 +380,38 @@ type BusFactorData struct {
 	TotalContributors int     `json:"total_contributors"             yaml:"total_contributors"`
 	PrimaryDevID      int     `json:"primary_dev_id"                 yaml:"primary_dev_id"`
 	PrimaryDevName    string  `json:"primary_dev_name"               yaml:"primary_dev_name"`
+	PrimaryDevEmail   string  `json:"primary_dev_email,omitempty"    yaml:"primary_dev_email,omitempty"`
 	PrimaryPct        float64 `json:"primary_percentage"             yaml:"primary_percentage"`
 	SecondaryDevID    int     `json:"secondary_dev_id,omitempty"     yaml:"secondary_dev_id,omitempty"`
 	SecondaryDevName  string  `json:"secondary_dev_name,omitempty"   yaml:"secondary_dev_name,omitempty"`
+	SecondaryDevEmail string  `json:"secondary_dev_email,omitempty"  yaml:"secondary_dev_email,omitempty"`
 	SecondaryPct      float64 `json:"secondary_percentage,omitempty" yaml:"secondary_percentage,omitempty"`
 	RiskLevel         string  `json:"risk_level"                     yaml:"risk_level"`
 }
 
+// DeveloperCommits holds a developer's commit count within a single tick.
+type DeveloperCommits struct {
+	DevID   int `json:"dev_id"  yaml:"dev_id"`
+	Commits int `json:"commits" yaml:"commits"`
+}
+
 // ActivityData contains time-series activity for a single tick.
 type ActivityData struct {
-	Tick         int         `json:"tick"          yaml:"tick"`
-	ByDeveloper  map[int]int `json:"by_developer"  yaml:"by_developer"`
-	TotalCommits int         `json:"total_commits" yaml:"total_commits"`
+	Tick         int                `json:"tick"                 yaml:"tick"`
+	StartTime    string             `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime      string             `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	ByDeveloper  []DeveloperCommits `json:"by_developer"         yaml:"by_developer"`
+	TotalCommits int                `json:"total_commits"        yaml:"total_commits"`
 }
 
 // ChurnData contains code churn for a single tick.
 type ChurnData struct {
-	Tick    int `json:"tick"          yaml:"tick"`
-	Added   int `json:"lines_added"   yaml:"lines_added"`
-	Removed int `json:"lines_removed" yaml:"lines_removed"`
-	Net     int `json:"net_change"    yaml:"net_change"`
+	Tick      int    `json:"tick"                 yaml:"tick"`
+	StartTime string `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime   string `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	Added     int    `json:"lines_added"          yaml:"lines_added"`
+	Removed   int    `json:"lines_removed"        yaml:"lines_removed"`
+	Net       int    `json:"net_change"           yaml:"net_change"`
 }
 
 // AggregateData contains summary statistics.
@@ -420,10 +475,12 @@ func processTickDevs(tick int, devTicks map[int]*DevTick, devMap map[int]*Develo
 func getOrCreateDev(devID, tick int, devMap map[int]*DeveloperData, names []string) *DeveloperData {
 	dev := devMap[devID]
 	if dev == nil {
+		name, email := devNameAndEmail(devID, names)
 		dev = &DeveloperData{
 			ID:        devID,
-			Name:      devName(devID, names),
-			Languages: make(map[string]pkgplumbing.LineStats),
+			Name:      name,
+			Email:     email,
+			langMap:   make(map[string]pkgplumbing.LineStats),
 			FirstTick: tick,
 			LastTick:  tick,
 		}
@@ -448,7 +505,7 @@ func updateDevStats(dev *DeveloperData, dt *DevTick, tick int) {
 		dev.LastTick = tick
 	}
 
-	mergeLanguageStats(dev.Languages, dt.Languages)
+	mergeLanguageStats(dev.langMap, dt.Languages)
 }
 
 func mergeLanguageStats(target, source map[string]pkgplumbing.LineStats) {
@@ -467,6 +524,7 @@ func collectDevResults(devMap map[int]*DeveloperData) []DeveloperData {
 
 	for _, dev := range devMap {
 		dev.NetLines = dev.Added - dev.Removed
+		dev.finalizeLanguages()
 		result = append(result, *dev)
 	}
 
@@ -496,7 +554,8 @@ func (m *LanguagesMetric) Compute(developers []DeveloperData) []LanguageData {
 	langMap := make(map[string]*LanguageData)
 
 	for _, dev := range developers {
-		for lang, langSt := range dev.Languages {
+		for _, langEntry := range dev.Languages {
+			lang := langEntry.Language
 			if lang == "" {
 				lang = "Other"
 			}
@@ -510,8 +569,8 @@ func (m *LanguagesMetric) Compute(developers []DeveloperData) []LanguageData {
 				langMap[lang] = ld
 			}
 
-			ld.TotalLines += langSt.Added
-			contribution := langSt.Added + langSt.Removed
+			ld.TotalLines += langEntry.Added
+			contribution := langEntry.Added + langEntry.Removed
 			ld.TotalContribution += contribution
 			ld.Contributors[dev.ID] += contribution
 		}
@@ -612,13 +671,13 @@ func (m *BusFactorMetric) ComputeWithOptions(input BusFactorInput, opts MetricOp
 
 		if len(contribs) > 0 {
 			bf.PrimaryDevID = contribs[0].id
-			bf.PrimaryDevName = devName(contribs[0].id, input.Names)
+			bf.PrimaryDevName, bf.PrimaryDevEmail = devNameAndEmail(contribs[0].id, input.Names)
 			bf.PrimaryPct = stats.ToPercent(float64(contribs[0].lines) / float64(ld.TotalContribution))
 		}
 
 		if len(contribs) > 1 {
 			bf.SecondaryDevID = contribs[1].id
-			bf.SecondaryDevName = devName(contribs[1].id, input.Names)
+			bf.SecondaryDevName, bf.SecondaryDevEmail = devNameAndEmail(contribs[1].id, input.Names)
 			bf.SecondaryPct = stats.ToPercent(float64(contribs[1].lines) / float64(ld.TotalContribution))
 		}
 
@@ -692,16 +751,22 @@ func (m *ActivityMetric) Compute(input *TickData) []ActivityData {
 	result := make([]ActivityData, len(tickKeys))
 
 	for i, tick := range tickKeys {
-		ad := ActivityData{
-			Tick:        tick,
-			ByDeveloper: make(map[int]int),
-		}
+		ad := ActivityData{Tick: tick}
+
+		devIDs := mapx.SortedKeys(input.Ticks[tick])
+		ad.ByDeveloper = make([]DeveloperCommits, 0, len(devIDs))
 
-		for devID, dt := range input.Ticks[tick] {
-			ad.ByDeveloper[devID] = dt.Commits
+		for _, devID := range devIDs {
+			dt := input.Ticks[tick][devID]
+			ad.ByDeveloper = append(ad.ByDeveloper, DeveloperCommits{DevID: devID, Commits: dt.Commits})
 			ad.TotalCommits += dt.Commits
 		}
 
+		if bounds, hasBounds := input.TickBounds[tick]; hasBounds {
+			ad.StartTime = bounds.FormatStartTime()
+			ad.EndTime = bounds.FormatEndTime()
+		}
+
 		result[i] = ad
 	}
 
@@ -742,6 +807,11 @@ func (m *ChurnMetric) Compute(input *TickData) []ChurnData {
 
 		cd.Net = cd.Added - cd.Removed
 
+		if bounds, hasBounds := input.TickBounds[tick]; hasBounds {
+			cd.StartTime = bounds.FormatStartTime()
+			cd.EndTime = bounds.FormatEndTime()
+		}
+
 		result[i] = cd
 	}
 
@@ -1063,14 +1133,14 @@ func (m *ComputedMetrics) ToYAML() any {
 
 const defaultTickHours = 24
 
-func devName(id int, names []string) string {
+func devNameAndEmail(id int, names []string) (name, email string) {
 	if id == identity.AuthorMissing {
-		return identity.AuthorMissingName
+		return identity.AuthorMissingName, ""
 	}
 
 	if id >= 0 && id < len(names) {
-		return names[id]
+		return identity.SplitIdentity(names[id])
 	}
 
-	return fmt.Sprintf("dev_%d", id)
+	return fmt.Sprintf("dev_%d", id), ""
 }
diff --git a/internal/analyzers/devs/metrics_test.go b/internal/analyzers/devs/metrics_test.go
index 4d18301..32b8b42 100644
--- a/internal/analyzers/devs/metrics_test.go
+++ b/internal/analyzers/devs/metrics_test.go
@@ -30,6 +30,21 @@ const (
 	testHashB        = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
 )
 
+// findLang finds a LanguageStatsEntry by name in a developer's Languages slice.
+func findLang(t *testing.T, langs []LanguageStatsEntry, name string) LanguageStatsEntry {
+	t.Helper()
+
+	for _, l := range langs {
+		if l.Language == name {
+			return l
+		}
+	}
+
+	t.Fatalf("language %q not found", name)
+
+	return LanguageStatsEntry{}
+}
+
 // --- ParseTickData Tests ---.
 
 func TestParseTickData_Valid(t *testing.T) {
@@ -247,12 +262,15 @@ func TestDevelopersMetric_LanguageAggregation(t *testing.T) {
 
 	require.Len(t, result, 1)
 	require.NotNil(t, result[0].Languages)
-	assert.Equal(t, 70, result[0].Languages[testLangGo].Added)   // 50 + 20
-	assert.Equal(t, 15, result[0].Languages[testLangGo].Removed) // 10 + 5
-	assert.Equal(t, 8, result[0].Languages[testLangGo].Changed)  // 5 + 3
-	assert.Equal(t, 30, result[0].Languages[testLangPython].Added)
-	assert.Equal(t, 5, result[0].Languages[testLangPython].Removed)
-	assert.Equal(t, 2, result[0].Languages[testLangPython].Changed)
+	goLang := findLang(t, result[0].Languages, testLangGo)
+	assert.Equal(t, 70, goLang.Added)   // 50 + 20
+	assert.Equal(t, 15, goLang.Removed) // 10 + 5
+	assert.Equal(t, 8, goLang.Changed)  // 5 + 3
+
+	pyLang := findLang(t, result[0].Languages, testLangPython)
+	assert.Equal(t, 30, pyLang.Added)
+	assert.Equal(t, 5, pyLang.Removed)
+	assert.Equal(t, 2, pyLang.Changed)
 }
 
 func TestDevelopersMetric_ChangedField(t *testing.T) {
@@ -306,7 +324,7 @@ func TestLanguagesMetric_SingleLanguage(t *testing.T) {
 	developers := []DeveloperData{
 		{
 			ID:        0,
-			Languages: map[string]pkgplumbing.LineStats{testLangGo: {Added: testLinesAdded}},
+			Languages: []LanguageStatsEntry{{Language: testLangGo, Added: testLinesAdded}},
 		},
 	}
 	metric := NewLanguagesMetric()
@@ -326,9 +344,9 @@ func TestLanguagesMetric_MultipleLanguages_SortedByTotalLines(t *testing.T) {
 	developers := []DeveloperData{
 		{
 			ID: 0,
-			Languages: map[string]pkgplumbing.LineStats{
-				testLangGo:     {Added: 50},
-				testLangPython: {Added: 150},
+			Languages: []LanguageStatsEntry{
+				{Language: testLangGo, Added: 50},
+				{Language: testLangPython, Added: 150},
 			},
 		},
 	}
@@ -350,7 +368,7 @@ func TestLanguagesMetric_EmptyLanguageName_BecomesOther(t *testing.T) {
 	developers := []DeveloperData{
 		{
 			ID:        0,
-			Languages: map[string]pkgplumbing.LineStats{"": {Added: testLinesAdded}},
+			Languages: []LanguageStatsEntry{{Language: "", Added: testLinesAdded}},
 		},
 	}
 	metric := NewLanguagesMetric()
@@ -365,8 +383,8 @@ func TestLanguagesMetric_MultipleContributors(t *testing.T) {
 	t.Parallel()
 
 	developers := []DeveloperData{
-		{ID: 0, Languages: map[string]pkgplumbing.LineStats{testLangGo: {Added: 60}}},
-		{ID: 1, Languages: map[string]pkgplumbing.LineStats{testLangGo: {Added: 40}}},
+		{ID: 0, Languages: []LanguageStatsEntry{{Language: testLangGo, Added: 60}}},
+		{ID: 1, Languages: []LanguageStatsEntry{{Language: testLangGo, Added: 40}}},
 	}
 	metric := NewLanguagesMetric()
 
@@ -384,8 +402,8 @@ func TestLanguagesMetric_ContributionIncludesRemoved(t *testing.T) {
 	t.Parallel()
 
 	developers := []DeveloperData{
-		{ID: 0, Languages: map[string]pkgplumbing.LineStats{testLangGo: {Added: 60, Removed: 40}}},
-		{ID: 1, Languages: map[string]pkgplumbing.LineStats{testLangGo: {Added: 10, Removed: 90}}},
+		{ID: 0, Languages: []LanguageStatsEntry{{Language: testLangGo, Added: 60, Removed: 40}}},
+		{ID: 1, Languages: []LanguageStatsEntry{{Language: testLangGo, Added: 10, Removed: 90}}},
 	}
 	metric := NewLanguagesMetric()
 
@@ -590,8 +608,11 @@ func TestActivityMetric_SingleTick(t *testing.T) {
 	require.Len(t, result, 1)
 	assert.Equal(t, 0, result[0].Tick)
 	assert.Equal(t, 8, result[0].TotalCommits)
-	assert.Equal(t, 5, result[0].ByDeveloper[0])
-	assert.Equal(t, 3, result[0].ByDeveloper[1])
+	require.Len(t, result[0].ByDeveloper, 2)
+	assert.Equal(t, 0, result[0].ByDeveloper[0].DevID)
+	assert.Equal(t, 5, result[0].ByDeveloper[0].Commits)
+	assert.Equal(t, 1, result[0].ByDeveloper[1].DevID)
+	assert.Equal(t, 3, result[0].ByDeveloper[1].Commits)
 }
 
 func TestActivityMetric_MultipleTicks(t *testing.T) {
@@ -1011,12 +1032,16 @@ func TestParseCommitsByTick_FromMap(t *testing.T) {
 	require.Len(t, result, 1)
 }
 
-func TestDevName_Variants(t *testing.T) {
+func TestDevNameAndEmail_Variants(t *testing.T) {
 	t.Parallel()
 
 	names := []string{"Alice", "Bob"}
 
-	assert.Equal(t, "Alice", devName(0, names))
-	assert.Equal(t, "Bob", devName(1, names))
-	assert.Contains(t, devName(99, names), "dev_99")
+	name0, _ := devNameAndEmail(0, names)
+	name1, _ := devNameAndEmail(1, names)
+	name99, _ := devNameAndEmail(99, names)
+
+	assert.Equal(t, "Alice", name0)
+	assert.Equal(t, "Bob", name1)
+	assert.Contains(t, name99, "dev_99")
 }
diff --git a/internal/analyzers/devs/plot.go b/internal/analyzers/devs/plot.go
index 8cb915f..7ac33ca 100644
--- a/internal/analyzers/devs/plot.go
+++ b/internal/analyzers/devs/plot.go
@@ -59,7 +59,7 @@ func buildTopDevBarSeries(activity []ActivityData, topDevs []int, nameByID map[i
 	for _, devID := range topDevs {
 		data := make([]plotpage.SeriesData, len(activity))
 		for i, ad := range activity {
-			data[i] = ad.ByDeveloper[devID]
+			data[i] = commitsForDev(ad.ByDeveloper, devID)
 		}
 
 		name := nameByID[devID]
@@ -92,9 +92,9 @@ func buildOthersBarSeries(activity []ActivityData, topDevs []int) plotpage.BarSe
 	for i, ad := range activity {
 		total := 0
 
-		for devID, commits := range ad.ByDeveloper {
-			if !topDevsSet[devID] {
-				total += commits
+		for _, dc := range ad.ByDeveloper {
+			if !topDevsSet[dc.DevID] {
+				total += dc.Commits
 			}
 		}
 
diff --git a/internal/analyzers/devs/store_writer_test.go b/internal/analyzers/devs/store_writer_test.go
index 2a76beb..5d3c834 100644
--- a/internal/analyzers/devs/store_writer_test.go
+++ b/internal/analyzers/devs/store_writer_test.go
@@ -1,7 +1,5 @@
 package devs
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/file_history/aggregator.go b/internal/analyzers/file_history/aggregator.go
index 44168de..aa9b0cc 100644
--- a/internal/analyzers/file_history/aggregator.go
+++ b/internal/analyzers/file_history/aggregator.go
@@ -411,6 +411,8 @@ func TicksToReport(ctx context.Context, ticks []analyze.TICK, repo *gitlib.Repos
 		report["tick_composition"] = tickComposition
 	}
 
+	report["tick_bounds"] = analyze.BuildTickBounds(ticks)
+
 	return report
 }
 
diff --git a/internal/analyzers/file_history/metrics.go b/internal/analyzers/file_history/metrics.go
index 2e1b6e9..782d85a 100644
--- a/internal/analyzers/file_history/metrics.go
+++ b/internal/analyzers/file_history/metrics.go
@@ -4,7 +4,6 @@ import (
 	"sort"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
-	pkgplumbing "github.com/Sumatoshi-tech/codefang/internal/plumbing"
 	"github.com/Sumatoshi-tech/codefang/pkg/metrics"
 )
 
@@ -38,12 +37,20 @@ type FileChurnData struct {
 	ChurnScore       float64 `json:"churn_score"         yaml:"churn_score"`
 }
 
-// FileContributorData contains contributor statistics for a file.
+// ContributorEntry holds line stats for a single contributor to a file.
+type ContributorEntry struct {
+	DevID   int `json:"dev_id"  yaml:"dev_id"`
+	Added   int `json:"added"   yaml:"added"`
+	Removed int `json:"removed" yaml:"removed"`
+	Changed int `json:"changed" yaml:"changed"`
+}
+
+// FileContributorData contains contributor breakdown for a file.
 type FileContributorData struct {
-	Path                string                        `json:"path"                  yaml:"path"`
-	Contributors        map[int]pkgplumbing.LineStats `json:"contributors"          yaml:"contributors"`
-	TopContributorID    int                           `json:"top_contributor_id"    yaml:"top_contributor_id"`
-	TopContributorLines int                           `json:"top_contributor_lines" yaml:"top_contributor_lines"`
+	Path                string             `json:"path"                  yaml:"path"`
+	Contributors        []ContributorEntry `json:"contributors"          yaml:"contributors"`
+	TopContributorID    int                `json:"top_contributor_id"    yaml:"top_contributor_id"`
+	TopContributorLines int                `json:"top_contributor_lines" yaml:"top_contributor_lines"`
 }
 
 // HotspotData identifies high-churn files that may need attention.
@@ -84,8 +91,10 @@ type CompositionData struct {
 
 // CompositionTimeSeriesEntry holds file composition for a single tick.
 type CompositionTimeSeriesEntry struct {
-	Tick      int            `json:"tick"      yaml:"tick"`
-	Breakdown map[string]int `json:"breakdown" yaml:"breakdown"`
+	Tick      int            `json:"tick"                 yaml:"tick"`
+	StartTime string         `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime   string         `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	Breakdown map[string]int `json:"breakdown"            yaml:"breakdown"`
 }
 
 // --- Computed Metrics ---.
@@ -150,7 +159,12 @@ func ComputeAllMetricsWithOptions(report analyze.Report, opts MetricOptions) (*C
 		tickComp = nil
 	}
 
-	composition, compositionTS := computeComposition(tickComp)
+	var tickBounds map[int]analyze.TickBounds
+	if v, tbOK := report["tick_bounds"].(map[int]analyze.TickBounds); tbOK {
+		tickBounds = v
+	}
+
+	composition, compositionTS := computeComposition(tickComp, tickBounds)
 
 	return &ComputedMetrics{
 		FileChurn:        computeFileChurn(input),
@@ -204,7 +218,16 @@ func computeFileContributors(input *ReportData) []FileContributorData {
 	for path, fh := range input.Files {
 		var topID, topLines int
 
+		contribs := make([]ContributorEntry, 0, len(fh.People))
+
 		for devID, stats := range fh.People {
+			contribs = append(contribs, ContributorEntry{
+				DevID:   devID,
+				Added:   stats.Added,
+				Removed: stats.Removed,
+				Changed: stats.Changed,
+			})
+
 			totalLines := stats.Added + stats.Changed
 			if totalLines > topLines {
 				topLines = totalLines
@@ -212,9 +235,13 @@ func computeFileContributors(input *ReportData) []FileContributorData {
 			}
 		}
 
+		sort.Slice(contribs, func(i, j int) bool {
+			return contribs[i].DevID < contribs[j].DevID
+		})
+
 		result = append(result, FileContributorData{
 			Path:                path,
-			Contributors:        fh.People,
+			Contributors:        contribs,
 			TopContributorID:    topID,
 			TopContributorLines: topLines,
 		})
@@ -278,7 +305,10 @@ func computeHotspotsWithOptions(input *ReportData, opts MetricOptions) []Hotspot
 	return result
 }
 
-func computeComposition(tickComp map[int]*CategoryCounts) (CompositionData, []CompositionTimeSeriesEntry) {
+func computeComposition(
+	tickComp map[int]*CategoryCounts,
+	tickBounds map[int]analyze.TickBounds,
+) (CompositionData, []CompositionTimeSeriesEntry) {
 	comp := CompositionData{
 		Breakdown:   make(map[string]int),
 		Percentages: make(map[string]float64),
@@ -312,10 +342,17 @@ func computeComposition(tickComp map[int]*CategoryCounts) (CompositionData, []Co
 			}
 		}
 
-		ts = append(ts, CompositionTimeSeriesEntry{
+		entry := CompositionTimeSeriesEntry{
 			Tick:      t,
 			Breakdown: breakdown,
-		})
+		}
+
+		if bounds, hasBounds := tickBounds[t]; hasBounds {
+			entry.StartTime = bounds.FormatStartTime()
+			entry.EndTime = bounds.FormatEndTime()
+		}
+
+		ts = append(ts, entry)
 	}
 
 	// Aggregate breakdown and percentages.
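
Reviewer note: the `Contributors` change above replaces a map keyed by developer ID with a slice sorted by `DevID`. Go randomizes map iteration order, so sorting keeps the serialized output stable between runs. Below is a minimal sketch of that pattern. `lineStats`, `contributorEntry`, and `flattenContributors` are hypothetical stand-ins for illustration, not the types or helpers in this patch.

```go
package main

import (
	"fmt"
	"sort"
)

// lineStats is a hypothetical stand-in for the per-developer stats value.
type lineStats struct{ Added, Removed, Changed int }

// contributorEntry is a hypothetical stand-in for one flattened row.
type contributorEntry struct {
	DevID   int
	Added   int
	Removed int
	Changed int
}

// flattenContributors turns a map keyed by developer ID into a slice
// sorted by DevID, so the output order does not depend on map iteration.
func flattenContributors(people map[int]lineStats) []contributorEntry {
	out := make([]contributorEntry, 0, len(people))

	for devID, s := range people {
		out = append(out, contributorEntry{
			DevID:   devID,
			Added:   s.Added,
			Removed: s.Removed,
			Changed: s.Changed,
		})
	}

	sort.Slice(out, func(i, j int) bool { return out[i].DevID < out[j].DevID })

	return out
}

func main() {
	people := map[int]lineStats{
		7: {Added: 3},
		2: {Added: 10, Removed: 1},
		5: {Changed: 4},
	}

	for _, c := range flattenContributors(people) {
		fmt.Println(c.DevID, c.Added, c.Removed, c.Changed)
	}
}
```

A row-per-contributor slice also maps more directly onto a fact table (one row per file and developer) than a JSON object with numeric keys.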
diff --git a/internal/analyzers/file_history/store_writer.go b/internal/analyzers/file_history/store_writer.go
index fb4c236..34fb3cd 100644
--- a/internal/analyzers/file_history/store_writer.go
+++ b/internal/analyzers/file_history/store_writer.go
@@ -63,7 +63,7 @@ func (h *HistoryAnalyzer) WriteToStoreFromAggregator(
 
 	// Write composition time series if available.
 	if len(fa.tickComposition) > 0 {
-		_, compositionTS := computeComposition(fa.tickComposition)
+		_, compositionTS := computeComposition(fa.tickComposition, nil)
 
 		compErr := analyze.WriteSliceKind(w, KindComposition, compositionTS)
 		if compErr != nil {
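
Reviewer note: passing `nil` for `tickBounds` here is safe because indexing a nil map in Go returns the zero value and `false` instead of panicking. The consequence is that composition entries written on this store path carry no `start_time` or `end_time`. Below is a minimal sketch of the lookup. `bounds`, `entry`, and `buildEntry` are hypothetical names for illustration, not the types in this patch.

```go
package main

import "fmt"

// bounds is a hypothetical stand-in for pre-formatted tick bounds.
type bounds struct{ Start, End string }

// entry is a hypothetical stand-in for one time-series row.
type entry struct {
	Tick      int
	StartTime string
	EndTime   string
}

// buildEntry fills the time fields only when bounds exist for the tick.
// Reading from a nil map is allowed and always reports ok == false.
func buildEntry(tick int, tickBounds map[int]bounds) entry {
	e := entry{Tick: tick}

	if b, ok := tickBounds[tick]; ok {
		e.StartTime = b.Start
		e.EndTime = b.End
	}

	return e
}

func main() {
	withBounds := map[int]bounds{0: {Start: "2024-01-01T00:00:00Z", End: "2024-01-02T00:00:00Z"}}

	fmt.Printf("%+v\n", buildEntry(0, withBounds))
	fmt.Printf("%+v\n", buildEntry(0, nil)) // time fields stay empty
}
```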
diff --git a/internal/analyzers/file_history/store_writer_test.go b/internal/analyzers/file_history/store_writer_test.go
index 746547f..74847a1 100644
--- a/internal/analyzers/file_history/store_writer_test.go
+++ b/internal/analyzers/file_history/store_writer_test.go
@@ -1,7 +1,5 @@
 package filehistory
 
-// FRD: specs/frds/FRD-20260301-burndown-filehistory-store-writer.md.
-
 import (
 	"context"
 	"fmt"
diff --git a/internal/analyzers/halstead/aggregator.go b/internal/analyzers/halstead/aggregator.go
index 0cc97e6..8853444 100644
--- a/internal/analyzers/halstead/aggregator.go
+++ b/internal/analyzers/halstead/aggregator.go
@@ -15,6 +15,7 @@ const (
 // Aggregator aggregates Halstead analysis results.
 type Aggregator struct {
 	*common.Aggregator
+	common.PerFileRetainer
 	detailed *common.DetailedDataCollector
 }
 
@@ -48,6 +49,10 @@ func (ha *Aggregator) SetAggregationMode(mode analyze.AggregationMode) {
 
 // Aggregate overrides the base Aggregate method to collect detailed functions.
 func (ha *Aggregator) Aggregate(results map[string]analyze.Report) {
+	for _, report := range results {
+		ha.Retain(report)
+	}
+
 	ha.detailed.CollectFromReports(results)
 	ha.Aggregator.Aggregate(results)
 }
diff --git a/internal/analyzers/halstead/aggregator_bench_test.go b/internal/analyzers/halstead/aggregator_bench_test.go
index 50ab30a..ecc23ed 100644
--- a/internal/analyzers/halstead/aggregator_bench_test.go
+++ b/internal/analyzers/halstead/aggregator_bench_test.go
@@ -1,7 +1,5 @@
 package halstead
 
-// FRD: specs/frds/FRD-20260311-halstead-dedup.md.
-
 import (
 	"fmt"
 	"testing"
diff --git a/internal/analyzers/halstead/aggregator_test.go b/internal/analyzers/halstead/aggregator_test.go
index 4e49108..353e566 100644
--- a/internal/analyzers/halstead/aggregator_test.go
+++ b/internal/analyzers/halstead/aggregator_test.go
@@ -317,8 +317,6 @@ func TestBuildEmptyHalsteadResult(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260311-halstead-dedup.md.
-
 func TestAggregator_DuplicateFuncNames_PreservedAcrossFiles(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/analyzers/halstead/cms_test.go b/internal/analyzers/halstead/cms_test.go
index e52499e..93d6399 100644
--- a/internal/analyzers/halstead/cms_test.go
+++ b/internal/analyzers/halstead/cms_test.go
@@ -72,7 +72,7 @@ func TestVisitor_CMSSketchPopulated_LargeFunction(t *testing.T) {
 	traverser.Traverse(root)
 
 	// Retrieve function metrics.
-	funcMetrics, ok := visitor.functionMetrics[cmsTestFuncName]
+	funcMetrics, ok := findFunctionMetrics(visitor.functionMetrics, cmsTestFuncName)
 
 	require.True(t, ok, "function metrics must exist")
 	require.NotNil(t, funcMetrics.OperatorSketch, "OperatorSketch should be populated for large function")
@@ -93,7 +93,7 @@ func TestVisitor_CMSNotUsed_SmallFunction(t *testing.T) {
 	traverser.RegisterVisitor(visitor)
 	traverser.Traverse(root)
 
-	funcMetrics, ok := visitor.functionMetrics[cmsTestFuncName]
+	funcMetrics, ok := findFunctionMetrics(visitor.functionMetrics, cmsTestFuncName)
 
 	require.True(t, ok, "function metrics must exist")
 
@@ -112,7 +112,7 @@ func TestVisitor_CMSTotalMatchesExact(t *testing.T) {
 	traverser.RegisterVisitor(visitor)
 	traverser.Traverse(root)
 
-	funcMetrics, ok := visitor.functionMetrics[cmsTestFuncName]
+	funcMetrics, ok := findFunctionMetrics(visitor.functionMetrics, cmsTestFuncName)
 
 	require.True(t, ok, "function metrics must exist")
 
@@ -136,7 +136,7 @@ func TestVisitor_EstimatedFields_Populated(t *testing.T) {
 	traverser.RegisterVisitor(visitor)
 	traverser.Traverse(root)
 
-	funcMetrics, ok := visitor.functionMetrics[cmsTestFuncName]
+	funcMetrics, ok := findFunctionMetrics(visitor.functionMetrics, cmsTestFuncName)
 
 	require.True(t, ok, "function metrics must exist")
 	assert.Positive(t, funcMetrics.EstimatedTotalOperators,
@@ -155,7 +155,7 @@ func TestVisitor_DerivedMetrics_CMSPath(t *testing.T) {
 	traverser.RegisterVisitor(visitor)
 	traverser.Traverse(root)
 
-	funcMetrics, ok := visitor.functionMetrics[cmsTestFuncName]
+	funcMetrics, ok := findFunctionMetrics(visitor.functionMetrics, cmsTestFuncName)
 
 	require.True(t, ok, "function metrics must exist")
 
@@ -334,3 +334,16 @@ func sumMapHelper(m map[string]int) int {
 
 	return sum
 }
+
+// findFunctionMetrics returns the first metrics entry whose Name matches.
+//
+//nolint:unparam // tests pass cmsTestFuncName today; keep signature generic for future callers.
+func findFunctionMetrics(metrics []*FunctionHalsteadMetrics, name string) (*FunctionHalsteadMetrics, bool) {
+	for _, m := range metrics {
+		if m.Name == name {
+			return m, true
+		}
+	}
+
+	return nil, false
+}
diff --git a/internal/analyzers/halstead/halstead.go b/internal/analyzers/halstead/halstead.go
index 107e16d..172445c 100644
--- a/internal/analyzers/halstead/halstead.go
+++ b/internal/analyzers/halstead/halstead.go
@@ -117,21 +117,21 @@ func extractOperandName(target *node.Node) (string, bool) {
 
 // Metrics holds all Halstead complexity measures.
 type Metrics struct {
-	Functions               map[string]*FunctionHalsteadMetrics `json:"functions"`
-	EstimatedLength         float64                             `json:"estimated_length"`
-	EstimatedTotalOperators int64                               `json:"estimated_total_operators" yaml:"estimated_total_operators"`
-	EstimatedTotalOperands  int64                               `json:"estimated_total_operands"  yaml:"estimated_total_operands"`
-	TotalOperators          int                                 `json:"total_operators"`
-	TotalOperands           int                                 `json:"total_operands"`
-	Vocabulary              int                                 `json:"vocabulary"`
-	Length                  int                                 `json:"length"`
-	DistinctOperators       int                                 `json:"distinct_operators"`
-	Volume                  float64                             `json:"volume"`
-	Difficulty              float64                             `json:"difficulty"`
-	Effort                  float64                             `json:"effort"`
-	TimeToProgram           float64                             `json:"time_to_program"`
-	DeliveredBugs           float64                             `json:"delivered_bugs"`
-	DistinctOperands        int                                 `json:"distinct_operands"`
+	Functions               []*FunctionHalsteadMetrics `json:"functions"`
+	EstimatedLength         float64                    `json:"estimated_length"`
+	EstimatedTotalOperators int64                      `json:"estimated_total_operators" yaml:"estimated_total_operators"`
+	EstimatedTotalOperands  int64                      `json:"estimated_total_operands"  yaml:"estimated_total_operands"`
+	TotalOperators          int                        `json:"total_operators"`
+	TotalOperands           int                        `json:"total_operands"`
+	Vocabulary              int                        `json:"vocabulary"`
+	Length                  int                        `json:"length"`
+	DistinctOperators       int                        `json:"distinct_operators"`
+	Volume                  float64                    `json:"volume"`
+	Difficulty              float64                    `json:"difficulty"`
+	Effort                  float64                    `json:"effort"`
+	TimeToProgram           float64                    `json:"time_to_program"`
+	DeliveredBugs           float64                    `json:"delivered_bugs"`
+	DistinctOperands        int                        `json:"distinct_operands"`
 }
 
 // FunctionHalsteadMetrics contains Halstead metrics for a single function.
@@ -159,7 +159,6 @@ type FunctionHalsteadMetrics struct {
 
 // FunctionReportItem is a typed representation of a per-function halstead report item.
 // Includes assessment strings and operator/operand maps. Avoids map[string]any allocation.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 type FunctionReportItem struct {
 	Operators               map[string]int
 	Operands                map[string]int
@@ -328,14 +327,13 @@ func (h *Analyzer) buildEmptyResult(message string) analyze.Report {
 }
 
 // calculateAllFunctionMetrics calculates metrics for all functions.
-func (h *Analyzer) calculateAllFunctionMetrics(functions []*node.Node) map[string]*FunctionHalsteadMetrics {
-	functionMetrics := make(map[string]*FunctionHalsteadMetrics)
+func (h *Analyzer) calculateAllFunctionMetrics(functions []*node.Node) []*FunctionHalsteadMetrics {
+	functionMetrics := make([]*FunctionHalsteadMetrics, 0, len(functions))
 
 	for _, fn := range functions {
-		funcName := h.getFunctionName(fn)
 		funcMetrics := h.calculateFunctionHalsteadMetrics(fn)
-		funcMetrics.Name = funcName
-		functionMetrics[funcName] = funcMetrics
+		funcMetrics.Name = h.getFunctionName(fn)
+		functionMetrics = append(functionMetrics, funcMetrics)
 	}
 
 	return functionMetrics
@@ -352,7 +350,7 @@ func (h *Analyzer) getFunctionName(fn *node.Node) string {
 }
 
 // calculateFileLevelMetrics calculates file-level metrics from function metrics.
-func (h *Analyzer) calculateFileLevelMetrics(functionMetrics map[string]*FunctionHalsteadMetrics) *Metrics {
+func (h *Analyzer) calculateFileLevelMetrics(functionMetrics []*FunctionHalsteadMetrics) *Metrics {
 	fileOperators := make(map[string]int)
 	fileOperands := make(map[string]int)
 
@@ -393,8 +391,7 @@ func (h *Analyzer) aggregateOperatorsAndOperandsFromMetrics(
 }
 
 // buildDetailedFunctionsTable creates the detailed functions table as typed structs.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
-func (h *Analyzer) buildDetailedFunctionsTable(functionMetrics map[string]*FunctionHalsteadMetrics) []FunctionReportItem {
+func (h *Analyzer) buildDetailedFunctionsTable(functionMetrics []*FunctionHalsteadMetrics) []FunctionReportItem {
 	items := make([]FunctionReportItem, 0, len(functionMetrics))
 
 	for _, fn := range functionMetrics {
@@ -468,7 +465,6 @@ func convertHalsteadFunctionItems(items any, sourceFile string) []map[string]any
 }
 
 // buildResult constructs the final analysis result.
-// FRD: specs/frds/FRD-20260311-typed-report-items.md.
 func (h *Analyzer) buildResult(
 	fileMetrics *Metrics, reportItems []FunctionReportItem, totalFunctions int, message string,
 ) analyze.Report {
diff --git a/internal/analyzers/halstead/metrics.go b/internal/analyzers/halstead/metrics.go
index 1ae9685..9e525ef 100644
--- a/internal/analyzers/halstead/metrics.go
+++ b/internal/analyzers/halstead/metrics.go
@@ -32,6 +32,9 @@ type ReportData struct {
 // FunctionData holds Halstead data for a single function.
 type FunctionData struct {
 	Name              string
+	SourceFile        string
+	Language          string
+	Directory         string
 	Volume            float64
 	Difficulty        float64
 	Effort            float64
@@ -130,11 +133,31 @@ func parseReportFunctions(report analyze.Report) []FunctionData {
 
 func parseFunctionData(fn map[string]any) FunctionData {
 	fd := FunctionData{}
+	parseFuncIdentity(&fd, fn)
+	parseFuncHalsteadMetrics(&fd, fn)
 
+	return fd
+}
+
+func parseFuncIdentity(fd *FunctionData, fn map[string]any) {
 	if name, ok := fn["name"].(string); ok {
 		fd.Name = name
 	}
 
+	if sf, ok := fn[analyze.SourceFileKey].(string); ok {
+		fd.SourceFile = sf
+	}
+
+	if lang, ok := fn[analyze.LanguageKey].(string); ok {
+		fd.Language = lang
+	}
+
+	if dir, ok := fn[analyze.DirectoryKey].(string); ok {
+		fd.Directory = dir
+	}
+}
+
+func parseFuncHalsteadMetrics(fd *FunctionData, fn map[string]any) {
 	if v, ok := fn["volume"].(float64); ok {
 		fd.Volume = v
 	}
@@ -182,21 +205,22 @@ func parseFunctionData(fn map[string]any) FunctionData {
 	if v, ok := fn["estimated_length"].(float64); ok {
 		fd.EstimatedLength = v
 	}
-
-	return fd
 }
 
 // --- Output Data Types ---.
 
 // FunctionHalsteadData contains Halstead metrics for a function.
 type FunctionHalsteadData struct {
-	Name            string  `json:"name"             yaml:"name"`
-	Volume          float64 `json:"volume"           yaml:"volume"`
-	Difficulty      float64 `json:"difficulty"       yaml:"difficulty"`
-	Effort          float64 `json:"effort"           yaml:"effort"`
-	TimeToProgram   float64 `json:"time_to_program"  yaml:"time_to_program"`
-	DeliveredBugs   float64 `json:"delivered_bugs"   yaml:"delivered_bugs"`
-	ComplexityLevel string  `json:"complexity_level" yaml:"complexity_level"`
+	Name            string  `json:"name"                  yaml:"name"`
+	SourceFile      string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language        string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory       string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	Volume          float64 `json:"volume"                yaml:"volume"`
+	Difficulty      float64 `json:"difficulty"            yaml:"difficulty"`
+	Effort          float64 `json:"effort"                yaml:"effort"`
+	TimeToProgram   float64 `json:"time_to_program"       yaml:"time_to_program"`
+	DeliveredBugs   float64 `json:"delivered_bugs"        yaml:"delivered_bugs"`
+	ComplexityLevel string  `json:"complexity_level"      yaml:"complexity_level"`
 }
 
 // EffortDistributionData contains effort distribution counts.
@@ -209,12 +233,15 @@ type EffortDistributionData struct {
 
 // HighEffortFunctionData identifies functions with high effort.
 type HighEffortFunctionData struct {
-	Name          string  `json:"name"            yaml:"name"`
-	Volume        float64 `json:"volume"          yaml:"volume"`
-	Effort        float64 `json:"effort"          yaml:"effort"`
-	TimeToProgram float64 `json:"time_to_program" yaml:"time_to_program"`
-	DeliveredBugs float64 `json:"delivered_bugs"  yaml:"delivered_bugs"`
-	RiskLevel     string  `json:"risk_level"      yaml:"risk_level"`
+	Name          string  `json:"name"                  yaml:"name"`
+	SourceFile    string  `json:"source_file,omitempty" yaml:"source_file,omitempty"`
+	Language      string  `json:"language,omitempty"    yaml:"language,omitempty"`
+	Directory     string  `json:"directory,omitempty"   yaml:"directory,omitempty"`
+	Volume        float64 `json:"volume"                yaml:"volume"`
+	Effort        float64 `json:"effort"                yaml:"effort"`
+	TimeToProgram float64 `json:"time_to_program"       yaml:"time_to_program"`
+	DeliveredBugs float64 `json:"delivered_bugs"        yaml:"delivered_bugs"`
+	RiskLevel     string  `json:"risk_level"            yaml:"risk_level"`
 }
 
 // AggregateData contains summary statistics.
@@ -303,6 +330,9 @@ func (m *FunctionHalsteadMetric) Compute(input *ReportData) []FunctionHalsteadDa
 
 		result = append(result, FunctionHalsteadData{
 			Name:            fn.Name,
+			SourceFile:      fn.SourceFile,
+			Language:        fn.Language,
+			Directory:       fn.Directory,
 			Volume:          fn.Volume,
 			Difficulty:      fn.Difficulty,
 			Effort:          fn.Effort,
@@ -405,6 +435,9 @@ func (m *HighEffortFunctionMetric) Compute(input *ReportData) []HighEffortFuncti
 
 		result = append(result, HighEffortFunctionData{
 			Name:          fn.Name,
+			SourceFile:    fn.SourceFile,
+			Language:      fn.Language,
+			Directory:     fn.Directory,
 			Volume:        fn.Volume,
 			Effort:        fn.Effort,
 			TimeToProgram: fn.TimeToProgram,
diff --git a/internal/analyzers/halstead/report_section.go b/internal/analyzers/halstead/report_section.go
index 430ece5..c572606 100644
--- a/internal/analyzers/halstead/report_section.go
+++ b/internal/analyzers/halstead/report_section.go
@@ -160,6 +160,7 @@ func (s *ReportSection) halsteadIssues(limit int) []analyze.Issue {
 		bugs := reportutil.GetFloat64(fn, KeyFuncBugs)
 		issues = append(issues, analyze.Issue{
 			Name:     name,
+			Location: reportutil.MapString(fn, analyze.SourceFileKey),
 			Value:    formatIssueValue(effort, volume, bugs),
 			Severity: severityForFunction(effort, bugs),
 		})
diff --git a/internal/analyzers/halstead/visitor.go b/internal/analyzers/halstead/visitor.go
index 14fc0d6..c128f4b 100644
--- a/internal/analyzers/halstead/visitor.go
+++ b/internal/analyzers/halstead/visitor.go
@@ -16,7 +16,7 @@ type halsteadContext struct {
 type Visitor struct {
 	metrics         *MetricsCalculator
 	detector        *OperatorOperandDetector
-	functionMetrics map[string]*FunctionHalsteadMetrics
+	functionMetrics []*FunctionHalsteadMetrics
 	contexts        *common.ContextStack[*halsteadContext]
 	nodeStack       *common.ContextStack[*node.Node]
 }
@@ -24,11 +24,10 @@ type Visitor struct {
 // NewVisitor creates a new Visitor.
 func NewVisitor() *Visitor {
 	return &Visitor{
-		contexts:        common.NewContextStack[*halsteadContext](),
-		metrics:         NewMetricsCalculator(),
-		detector:        NewOperatorOperandDetector(),
-		functionMetrics: make(map[string]*FunctionHalsteadMetrics),
-		nodeStack:       common.NewContextStack[*node.Node](),
+		contexts:  common.NewContextStack[*halsteadContext](),
+		metrics:   NewMetricsCalculator(),
+		detector:  NewOperatorOperandDetector(),
+		nodeStack: common.NewContextStack[*node.Node](),
 	}
 }
 
@@ -132,7 +131,7 @@ func (v *Visitor) popContext() {
 	v.metrics.CalculateHalsteadMetrics(ctx.metrics)
 
 	// Store result.
-	v.functionMetrics[ctx.metrics.Name] = ctx.metrics
+	v.functionMetrics = append(v.functionMetrics, ctx.metrics)
 }
 
 func (v *Visitor) currentContext() *halsteadContext {
diff --git a/internal/analyzers/halstead/visitor_dedup_test.go b/internal/analyzers/halstead/visitor_dedup_test.go
new file mode 100644
index 0000000..757b106
--- /dev/null
+++ b/internal/analyzers/halstead/visitor_dedup_test.go
@@ -0,0 +1,59 @@
+package halstead
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/pkg/uast/pkg/node"
+)
+
+// TestVisitor_CountsAllSameNameFunctions guards against the regression where
+// per-function metrics were stored in a map keyed by function name only.
+// Multiple functions in the same file sharing a name (e.g. methods named
+// `Read` on different receivers in Go) were silently overwriting each other,
+// and `total_functions` was reported as `len(map)` rather than the actual
+// number of declared functions.
+func TestVisitor_CountsAllSameNameFunctions(t *testing.T) {
+	t.Parallel()
+
+	const (
+		sharedName = "Read"
+		dupCount   = 5
+	)
+
+	root := &node.Node{Type: node.UASTFile}
+
+	for range dupCount {
+		fn := &node.Node{Type: node.UASTFunction}
+		fn.Roles = []node.Role{node.RoleFunction, node.RoleDeclaration}
+
+		nameNode := node.NewNodeWithToken(node.UASTIdentifier, sharedName)
+		nameNode.Roles = []node.Role{node.RoleName}
+		fn.AddChild(nameNode)
+
+		root.AddChild(fn)
+	}
+
+	visitor := NewVisitor()
+	traverser := analyze.NewMultiAnalyzerTraverser()
+	traverser.RegisterVisitor(visitor)
+	traverser.Traverse(root)
+
+	assert.Lenf(t, visitor.functionMetrics, dupCount,
+		"visitor must record one entry per function declaration, not dedup by name")
+
+	report := visitor.GetReport()
+
+	totalFunctions, ok := report["total_functions"].(int)
+	require.True(t, ok, "total_functions must be present and int-typed")
+	assert.Equalf(t, dupCount, totalFunctions,
+		"reported total_functions must match declarations, not unique names")
+
+	items, ok := analyze.ReportFunctionList(report, "functions")
+	require.True(t, ok, "functions collection must be readable")
+	assert.Lenf(t, items, dupCount,
+		"detailed function items must include every declaration, not dedup by name")
+}
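
Reviewer note: the regression this test guards against can be reproduced in isolation. Below is a minimal sketch that contrasts name-keyed map storage with slice storage. `funcMetrics`, `collectByName`, and `collectAll` are hypothetical stand-ins for illustration, not the analyzer's real types.

```go
package main

import "fmt"

// funcMetrics is a hypothetical stand-in for per-function metrics.
type funcMetrics struct {
	Name   string
	Volume float64
}

// collectByName mimics the old storage: a map keyed by function name only.
func collectByName(decls []funcMetrics) map[string]funcMetrics {
	out := make(map[string]funcMetrics)

	for _, d := range decls {
		out[d.Name] = d // a later declaration overwrites an earlier one
	}

	return out
}

// collectAll mimics the new storage: one slice entry per declaration.
func collectAll(decls []funcMetrics) []funcMetrics {
	out := make([]funcMetrics, 0, len(decls))
	out = append(out, decls...)

	return out
}

func main() {
	decls := []funcMetrics{
		{Name: "Read", Volume: 10},
		{Name: "Read", Volume: 20},
		{Name: "Write", Volume: 5},
	}

	// The map collapses the two "Read" declarations; the slice keeps all three.
	fmt.Println(len(collectByName(decls)), len(collectAll(decls))) // prints "2 3"
}
```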
diff --git a/internal/analyzers/imports/aggregator.go b/internal/analyzers/imports/aggregator.go
index 8b398b9..b4760a9 100644
--- a/internal/analyzers/imports/aggregator.go
+++ b/internal/analyzers/imports/aggregator.go
@@ -3,10 +3,13 @@ package imports
 
 import (
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common"
 )
 
 // Aggregator aggregates import analysis results across multiple files.
 type Aggregator struct {
+	common.PerFileRetainer
+
 	allImports map[string]int // Import path -> count.
 	totalFiles int
 }
@@ -21,6 +24,7 @@ func NewAggregator() *Aggregator {
 // Aggregate combines results from multiple files.
 func (a *Aggregator) Aggregate(results map[string]analyze.Report) {
 	for _, report := range results {
+		a.Retain(report)
 		a.totalFiles++
 
 		if imports, ok := report["imports"].([]string); ok {
diff --git a/internal/analyzers/imports/report_section.go b/internal/analyzers/imports/report_section.go
index e37b039..6383907 100644
--- a/internal/analyzers/imports/report_section.go
+++ b/internal/analyzers/imports/report_section.go
@@ -89,10 +89,13 @@ func (s *ReportSection) AllIssues() []analyze.Issue {
 }
 
 // importIssues builds import issues sorted by frequency (or name), limited to limit (0 = all).
+// When the report contains a _source_file key, it is used as the Location for each issue.
 func (s *ReportSection) importIssues(limit int) []analyze.Issue {
+	location := reportutil.GetString(s.report, analyze.SourceFileKey)
+
 	counts := reportutil.GetStringIntMap(s.report, KeyImportCounts)
 	if len(counts) > 0 {
-		return buildIssuesFromCounts(counts, limit)
+		return buildIssuesFromCounts(counts, limit, location)
 	}
 
 	// Fallback: use simple imports list.
@@ -101,11 +104,11 @@ func (s *ReportSection) importIssues(limit int) []analyze.Issue {
 		return nil
 	}
 
-	return buildIssuesFromList(imports, limit)
+	return buildIssuesFromList(imports, limit, location)
 }
 
 // buildIssuesFromCounts creates sorted issues from import_counts map.
-func buildIssuesFromCounts(counts map[string]int, limit int) []analyze.Issue {
+func buildIssuesFromCounts(counts map[string]int, limit int, location string) []analyze.Issue {
 	entries := make([]importEntry, 0, len(counts))
 	for name, count := range counts {
 		entries = append(entries, importEntry{name: name, count: count})
@@ -118,6 +121,7 @@ func buildIssuesFromCounts(counts map[string]int, limit int) []analyze.Issue {
 	for _, e := range sorted {
 		issues = append(issues, analyze.Issue{
 			Name:     e.name,
+			Location: location,
 			Value:    reportutil.FormatInt(e.count),
 			Severity: analyze.SeverityInfo,
 		})
@@ -127,13 +131,14 @@ func buildIssuesFromCounts(counts map[string]int, limit int) []analyze.Issue {
 }
 
 // buildIssuesFromList creates issues from a simple string slice sorted alphabetically.
-func buildIssuesFromList(imports []string, limit int) []analyze.Issue {
+func buildIssuesFromList(imports []string, limit int, location string) []analyze.Issue {
 	sorted := mapx.SortAndLimit(imports, importNameLess, limit)
 
 	issues := make([]analyze.Issue, 0, len(sorted))
 	for _, imp := range sorted {
 		issues = append(issues, analyze.Issue{
 			Name:     imp,
+			Location: location,
 			Value:    "1",
 			Severity: analyze.SeverityInfo,
 		})
diff --git a/internal/analyzers/imports/report_section_test.go b/internal/analyzers/imports/report_section_test.go
index 4d44281..bd32214 100644
--- a/internal/analyzers/imports/report_section_test.go
+++ b/internal/analyzers/imports/report_section_test.go
@@ -193,3 +193,46 @@ func TestImportsImplementsInterface(t *testing.T) {
 
 	var _ analyze.ReportSection = (*ReportSection)(nil)
 }
+
+func TestImportsPerFile_IssuesHaveLocation(t *testing.T) {
+	t.Parallel()
+
+	report := analyze.Report{
+		"imports":             []string{"fmt", "os"},
+		"count":               2,
+		"import_counts":       map[string]int{"fmt": 1, "os": 1},
+		analyze.SourceFileKey: "/repo/pkg/foo.go",
+	}
+
+	section := NewReportSection(report)
+	issues := section.AllIssues()
+
+	if len(issues) == 0 {
+		t.Fatal("expected issues for imports")
+	}
+
+	for _, issue := range issues {
+		if issue.Location != "/repo/pkg/foo.go" {
+			t.Errorf("issue %q location = %q, want %q",
+				issue.Name, issue.Location, "/repo/pkg/foo.go")
+		}
+	}
+}
+
+func TestImportsPerFile_NoSourceFile_EmptyLocation(t *testing.T) {
+	t.Parallel()
+
+	report := newTestImportsReport()
+	section := NewReportSection(report)
+	issues := section.AllIssues()
+
+	if len(issues) == 0 {
+		t.Fatal("expected issues")
+	}
+
+	for _, issue := range issues {
+		if issue.Location != "" {
+			t.Errorf("issue %q location = %q, want empty", issue.Name, issue.Location)
+		}
+	}
+}
diff --git a/internal/analyzers/imports/store_writer_test.go b/internal/analyzers/imports/store_writer_test.go
index 2502b23..ce18107 100644
--- a/internal/analyzers/imports/store_writer_test.go
+++ b/internal/analyzers/imports/store_writer_test.go
@@ -1,7 +1,5 @@
 package imports
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"sort"
diff --git a/internal/analyzers/plumbing/identity.go b/internal/analyzers/plumbing/identity.go
index e516c5e..c26a621 100644
--- a/internal/analyzers/plumbing/identity.go
+++ b/internal/analyzers/plumbing/identity.go
@@ -23,6 +23,7 @@ type IdentityDetector struct {
 	ReversedPeopleDict []string
 	AuthorID           int
 	ExactSignatures    bool
+
 	// incrementalEmails and incrementalNames are used when building the dict incrementally
 	// during Consume() when commits aren't available during Configure().
 	incrementalEmails map[int][]string
diff --git a/internal/analyzers/plumbing/langpath/langpath.go b/internal/analyzers/plumbing/langpath/langpath.go
new file mode 100644
index 0000000..eaf623a
--- /dev/null
+++ b/internal/analyzers/plumbing/langpath/langpath.go
@@ -0,0 +1,82 @@
+// Package langpath converts user-supplied language tokens into
+// deterministic pathspec globs backed by enry's Linguist data.
+package langpath
+
+import (
+	"errors"
+	"fmt"
+	"slices"
+	"strings"
+
+	"github.com/src-d/enry/v2"
+	"github.com/src-d/enry/v2/data"
+)
+
+// ErrUnknownLanguage is returned when a user-supplied token does not
+// resolve to any Linguist language (including its aliases).
+var ErrUnknownLanguage = errors.New("unknown language")
+
+// filenamesByLanguage inverts data.LanguagesByFilename so we can
+// look up "language → []filename" at Globs time. Built once at
+// package load; read-only thereafter.
+var filenamesByLanguage = invertLanguagesByFilename()
+
+func invertLanguagesByFilename() map[string][]string {
+	out := make(map[string][]string)
+
+	for filename, langs := range data.LanguagesByFilename {
+		for _, lang := range langs {
+			out[lang] = append(out[lang], filename)
+		}
+	}
+
+	return out
+}
+
+const (
+	// allToken is the sentinel meaning "do not restrict by language".
+	allToken = "all"
+	// extensionGlobPrefix is prepended to every extension-derived glob.
+	extensionGlobPrefix = "*"
+)
+
+// Globs converts a list of user-supplied language tokens into a
+// sorted, deduplicated set of pathspec globs. wantsAll is true when
+// the caller did not restrict languages (empty input or the literal
+// "all" token). Callers should skip path-spec push-down in that case.
+func Globs(langs []string) (globs []string, wantsAll bool, err error) {
+	if len(langs) == 0 {
+		return nil, true, nil
+	}
+
+	set := make(map[string]struct{})
+
+	for _, raw := range langs {
+		token := strings.TrimSpace(raw)
+		if strings.EqualFold(token, allToken) {
+			return nil, true, nil
+		}
+
+		canonical, ok := enry.GetLanguageByAlias(token)
+		if !ok {
+			return nil, false, fmt.Errorf("%w: %q", ErrUnknownLanguage, raw)
+		}
+
+		for _, ext := range enry.GetLanguageExtensions(canonical) {
+			set[extensionGlobPrefix+ext] = struct{}{}
+		}
+
+		for _, name := range filenamesByLanguage[canonical] {
+			set[name] = struct{}{}
+		}
+	}
+
+	out := make([]string, 0, len(set))
+	for g := range set {
+		out = append(out, g)
+	}
+
+	slices.Sort(out)
+
+	return out, false, nil
+}
diff --git a/internal/analyzers/plumbing/langpath/langpath_test.go b/internal/analyzers/plumbing/langpath/langpath_test.go
new file mode 100644
index 0000000..6f63b63
--- /dev/null
+++ b/internal/analyzers/plumbing/langpath/langpath_test.go
@@ -0,0 +1,147 @@
+package langpath_test
+
+import (
+	"slices"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/langpath"
+)
+
+func TestGlobs_AllToken_YieldsWantsAll(t *testing.T) {
+	t.Parallel()
+
+	globs, wantsAll, err := langpath.Globs([]string{"all"})
+
+	require.NoError(t, err)
+	assert.True(t, wantsAll, "all token must set wantsAll")
+	assert.Nil(t, globs, "wantsAll must return nil globs")
+}
+
+func TestGlobs_ReturnsFreshSlicePerCall(t *testing.T) {
+	t.Parallel()
+
+	a, _, errA := langpath.Globs([]string{"go"})
+	require.NoError(t, errA)
+	require.NotEmpty(t, a)
+
+	b, _, errB := langpath.Globs([]string{"go"})
+	require.NoError(t, errB)
+	require.NotEmpty(t, b)
+
+	const tampered = "tampered"
+
+	a[0] = tampered
+	assert.NotEqual(t, tampered, b[0],
+		"mutating one call's result must not affect a subsequent call's result")
+}
+
+func TestGlobs_Dockerfile_IncludesBasenameGlob(t *testing.T) {
+	t.Parallel()
+
+	globs, wantsAll, err := langpath.Globs([]string{"dockerfile"})
+
+	require.NoError(t, err)
+	assert.False(t, wantsAll)
+	assert.Contains(t, globs, "Dockerfile",
+		"filename-only languages must emit a literal-filename glob")
+}
+
+func TestGlobs_MultipleLanguages_SortedAndDeduplicated(t *testing.T) {
+	t.Parallel()
+
+	globs, wantsAll, err := langpath.Globs([]string{"python", "go", "python"})
+
+	require.NoError(t, err)
+	assert.False(t, wantsAll)
+	assert.NotEmpty(t, globs)
+	assert.True(t, slices.IsSorted(globs), "globs must be sorted")
+	assert.Contains(t, globs, "*.go", "go extension must be present")
+	assert.Contains(t, globs, "*.py", "python extension must be present")
+	assert.Len(t, mapset(globs), len(globs), "globs must be deduplicated")
+}
+
+func mapset(xs []string) map[string]struct{} {
+	m := make(map[string]struct{}, len(xs))
+	for _, x := range xs {
+		m[x] = struct{}{}
+	}
+
+	return m
+}
+
+func TestGlobs_UnknownToken_ReturnsErrUnknownLanguage(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name string
+		in   []string
+	}{
+		{"solo", []string{"notalang"}},
+		{"after known", []string{"go", "notalang"}},
+		{"before known", []string{"notalang", "go"}},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			globs, wantsAll, err := langpath.Globs(tt.in)
+
+			require.ErrorIs(t, err, langpath.ErrUnknownLanguage)
+			assert.False(t, wantsAll)
+			assert.Nil(t, globs)
+			assert.Contains(t, err.Error(), "notalang")
+		})
+	}
+}
+
+func TestGlobs_GoToken_YieldsStarDotGo(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name string
+		in   string
+	}{
+		{"lowercase", "go"},
+		{"titlecase", "Go"},
+		{"uppercase", "GO"},
+		{"padded", "  go  "},
+		{"alias golang", "golang"},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			globs, wantsAll, err := langpath.Globs([]string{tt.in})
+
+			require.NoError(t, err)
+			assert.False(t, wantsAll)
+			assert.Equal(t, []string{"*.go"}, globs)
+		})
+	}
+}
+
+func TestGlobs_EmptyInput_YieldsWantsAll(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name string
+		in   []string
+	}{
+		{"nil slice", nil},
+		{"empty slice", []string{}},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			globs, wantsAll, err := langpath.Globs(tt.in)
+
+			require.NoError(t, err)
+			assert.True(t, wantsAll)
+			assert.Nil(t, globs)
+		})
+	}
+}
diff --git a/internal/analyzers/plumbing/pathpolicy/pathpolicy.go b/internal/analyzers/plumbing/pathpolicy/pathpolicy.go
new file mode 100644
index 0000000..0cfdb88
--- /dev/null
+++ b/internal/analyzers/plumbing/pathpolicy/pathpolicy.go
@@ -0,0 +1,65 @@
+// Package pathpolicy decides whether a file path should be excluded
+// from analysis based on user-visible options that mirror the CLI
+// flags (--include-vendored, --include-generated,
+// --extra-excluded-prefixes). Pure, stateless, cross-phase.
+package pathpolicy
+
+import (
+	"strings"
+
+	"github.com/src-d/enry/v2"
+
+	"github.com/Sumatoshi-tech/codefang/pkg/pathfilter"
+)
+
+// defaultFilter carries the built-in generated-file heuristics
+// (filename suffixes, prefixes, and content markers) as they ship in
+// pkg/pathfilter. Reusing one immutable instance keeps allocation
+// off the hot path.
+var defaultFilter = pathfilter.New()
+
+// Options captures the user-visible configuration.
+// The zero value excludes vendored and generated paths and nothing else.
+type Options struct {
+	IncludeVendored       bool
+	IncludeGenerated      bool
+	ExtraExcludedPrefixes []string
+}
+
+// Exclude reports whether the given path should be skipped.
+// content may be nil; when provided, content-based heuristics may
+// refine the generated-file classification.
+func Exclude(path string, content []byte, opts Options) bool {
+	switch {
+	case matchesAnyPrefix(path, opts.ExtraExcludedPrefixes):
+		return true
+	case !opts.IncludeVendored && enry.IsVendor(path):
+		return true
+	case !opts.IncludeGenerated && isGenerated(path, content):
+		return true
+	}
+
+	return false
+}
+
+// matchesAnyPrefix returns true if path begins with any non-empty
+// entry of prefixes.
+func matchesAnyPrefix(path string, prefixes []string) bool {
+	for _, prefix := range prefixes {
+		if prefix != "" && strings.HasPrefix(path, prefix) {
+			return true
+		}
+	}
+
+	return false
+}
+
+// isGenerated returns true if the path or header content identifies
+// the file as machine-generated per the built-in heuristics.
+func isGenerated(path string, content []byte) bool {
+	if defaultFilter.IsGeneratedPath(path) {
+		return true
+	}
+
+	return len(content) > 0 && defaultFilter.IsGeneratedContent(content)
+}
diff --git a/internal/analyzers/plumbing/pathpolicy/pathpolicy_test.go b/internal/analyzers/plumbing/pathpolicy/pathpolicy_test.go
new file mode 100644
index 0000000..4cff589
--- /dev/null
+++ b/internal/analyzers/plumbing/pathpolicy/pathpolicy_test.go
@@ -0,0 +1,134 @@
+package pathpolicy_test
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
+)
+
+func TestExclude_PlainPath_Included(t *testing.T) {
+	t.Parallel()
+
+	got := pathpolicy.Exclude("pkg/foo/bar.go", nil, pathpolicy.Options{})
+
+	assert.False(t, got,
+		"a non-vendor non-generated path must not be excluded under default options")
+}
+
+func TestExclude_VendorPath_ExcludedByDefault(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name string
+		path string
+	}{
+		{"go vendor", "vendor/github.com/pkg/errors/errors.go"},
+		{"node_modules", "node_modules/left-pad/index.js"},
+		{"third-party", "third_party/boringssl/src.c"},
+		{"testdata", "pkg/foo/testdata/sample.json"},
+		{"minified js", "static/jquery.min.js"},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			got := pathpolicy.Exclude(tt.path, nil, pathpolicy.Options{})
+			assert.True(t, got,
+				"Linguist-vendored path must be excluded under default options: "+tt.path)
+		})
+	}
+}
+
+func TestExclude_GeneratedPath_ExcludedByDefault(t *testing.T) {
+	t.Parallel()
+
+	tests := []struct {
+		name string
+		path string
+	}{
+		{"go protobuf", "pkg/api/foo.pb.go"},
+		{"k8s zz_generated", "pkg/apis/core/v1/zz_generated_deepcopy.go"},
+		{"python protobuf", "pkg/api/foo_pb2.py"},
+		{"mockgen", "mocks/mock_service.go"},
+	}
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			t.Parallel()
+
+			got := pathpolicy.Exclude(tt.path, nil, pathpolicy.Options{})
+			assert.True(t, got,
+				"generated-looking path must be excluded under default options: "+tt.path)
+		})
+	}
+}
+
+func TestExclude_ExtraExcludedPrefixes_ExcludesMatches(t *testing.T) {
+	t.Parallel()
+
+	opts := pathpolicy.Options{
+		ExtraExcludedPrefixes: []string{".venv/", "docs/"},
+	}
+
+	assert.True(t, pathpolicy.Exclude(".venv/lib/foo.py", nil, opts),
+		".venv/ prefix must exclude python virtualenv content")
+	assert.True(t, pathpolicy.Exclude("docs/README.md", nil, opts),
+		"docs/ prefix must exclude documentation")
+	assert.False(t, pathpolicy.Exclude("pkg/foo.go", nil, opts),
+		"a non-matching path must not be excluded")
+}
+
+func TestExclude_ExtraExcludedPrefixes_BypassIncludeOverrides(t *testing.T) {
+	t.Parallel()
+
+	opts := pathpolicy.Options{
+		IncludeVendored:       true,
+		IncludeGenerated:      true,
+		ExtraExcludedPrefixes: []string{"vendor/"},
+	}
+
+	assert.True(t, pathpolicy.Exclude("vendor/foo.go", nil, opts),
+		"ExtraExcludedPrefixes must still apply even when include flags are set")
+}
+
+func TestExclude_GeneratedContentMarker_ExcludedByDefault(t *testing.T) {
+	t.Parallel()
+
+	content := []byte("// Code generated by protoc-gen-go. DO NOT EDIT.\npackage foo\n")
+
+	got := pathpolicy.Exclude("pkg/foo/ordinary.go", content, pathpolicy.Options{})
+	assert.True(t, got,
+		"content starting with a generated-file marker must be excluded under default options")
+}
+
+func TestExclude_IncludeGenerated_KeepsContentMarker(t *testing.T) {
+	t.Parallel()
+
+	content := []byte("// Code generated by protoc-gen-go. DO NOT EDIT.\npackage foo\n")
+	opts := pathpolicy.Options{IncludeGenerated: true}
+
+	got := pathpolicy.Exclude("pkg/foo/ordinary.go", content, opts)
+	assert.False(t, got,
+		"IncludeGenerated=true must keep a generated-content file in analysis")
+}
+
+func TestExclude_IncludeGenerated_KeepsGenerated(t *testing.T) {
+	t.Parallel()
+
+	opts := pathpolicy.Options{IncludeGenerated: true}
+
+	got := pathpolicy.Exclude("pkg/api/foo.pb.go", nil, opts)
+	assert.False(t, got,
+		"IncludeGenerated=true must keep generated paths in analysis")
+}
+
+func TestExclude_IncludeVendored_KeepsVendor(t *testing.T) {
+	t.Parallel()
+
+	opts := pathpolicy.Options{IncludeVendored: true}
+
+	got := pathpolicy.Exclude("vendor/github.com/pkg/errors/errors.go", nil, opts)
+	assert.False(t, got,
+		"IncludeVendored=true must keep vendor paths in analysis")
+}
diff --git a/internal/analyzers/plumbing/plumbing_test.go b/internal/analyzers/plumbing/plumbing_test.go
index c91d5fd..e48f906 100644
--- a/internal/analyzers/plumbing/plumbing_test.go
+++ b/internal/analyzers/plumbing/plumbing_test.go
@@ -6,6 +6,7 @@ import (
 
 	"github.com/stretchr/testify/require"
 
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 )
 
@@ -26,6 +27,56 @@ func TestTreeDiffAnalyzer_Configure(t *testing.T) {
 	require.NoError(t, err)
 }
 
+func TestTreeDiffAnalyzer_Configure_BuildsPathspecFromLanguages(t *testing.T) {
+	t.Parallel()
+
+	td := &TreeDiffAnalyzer{}
+	err := td.Configure(map[string]any{
+		ConfigTreeDiffLanguages: []string{"go"},
+	})
+
+	require.NoError(t, err)
+	require.NotEmpty(t, td.Pathspec, "pathspec must be built from --languages")
+	require.Contains(t, td.Pathspec, "*.go")
+}
+
+func TestTreeDiffAnalyzer_Configure_AllLanguagesGivesEmptyPathspec(t *testing.T) {
+	t.Parallel()
+
+	td := &TreeDiffAnalyzer{}
+	err := td.Configure(map[string]any{
+		ConfigTreeDiffLanguages: []string{"all"},
+	})
+
+	require.NoError(t, err)
+	require.Empty(t, td.Pathspec,
+		"all languages must skip path-spec push-down (empty pathspec)")
+}
+
+func TestTreeDiffAnalyzer_Configure_AliasResolvesToCanonicalInLanguagesSet(t *testing.T) {
+	t.Parallel()
+
+	td := &TreeDiffAnalyzer{}
+	err := td.Configure(map[string]any{
+		ConfigTreeDiffLanguages: []string{"golang"},
+	})
+
+	require.NoError(t, err)
+	require.True(t, td.Languages["go"],
+		"alias 'golang' must resolve so the Go-side filter recognizes canonical lowercase 'go'")
+}
+
+func TestTreeDiffAnalyzer_Configure_UnknownLanguageReturnsError(t *testing.T) {
+	t.Parallel()
+
+	td := &TreeDiffAnalyzer{}
+	err := td.Configure(map[string]any{
+		ConfigTreeDiffLanguages: []string{"notalang"},
+	})
+
+	require.Error(t, err, "unknown language must surface at Configure time")
+}
+
 func TestTreeDiffAnalyzer_Initialize(t *testing.T) {
 	t.Parallel()
 
@@ -128,6 +179,44 @@ func TestChangeEntry_Hash(t *testing.T) {
 	}
 }
 
+func TestTreeDiff_filterChanges_DefaultPolicyDropsVendor(t *testing.T) {
+	t.Parallel()
+
+	hash := gitlib.NewHash("1111111111111111111111111111111111111111")
+	td := &TreeDiffAnalyzer{
+		Languages: map[string]bool{allLanguages: true},
+	}
+
+	changes := gitlib.Changes{
+		{Action: gitlib.Modify, To: gitlib.ChangeEntry{Name: "vendor/foo.go", Hash: hash}},
+		{Action: gitlib.Modify, To: gitlib.ChangeEntry{Name: "pkg/bar.go", Hash: hash}},
+	}
+
+	filtered := td.filterChanges(context.Background(), changes)
+	require.Len(t, filtered, 1)
+	require.Equal(t, "pkg/bar.go", filtered[0].To.Name,
+		"default TreeDiffAnalyzer (zero PathPolicy) must drop vendor paths")
+}
+
+func TestTreeDiff_filterChanges_IncludeVendoredKeepsVendor(t *testing.T) {
+	t.Parallel()
+
+	hash := gitlib.NewHash("1111111111111111111111111111111111111111")
+	td := &TreeDiffAnalyzer{
+		Languages:  map[string]bool{allLanguages: true},
+		PathPolicy: pathpolicy.Options{IncludeVendored: true, IncludeGenerated: true},
+	}
+
+	changes := gitlib.Changes{
+		{Action: gitlib.Modify, To: gitlib.ChangeEntry{Name: "vendor/foo.go", Hash: hash}},
+		{Action: gitlib.Modify, To: gitlib.ChangeEntry{Name: "pkg/bar.go", Hash: hash}},
+	}
+
+	filtered := td.filterChanges(context.Background(), changes)
+	require.Len(t, filtered, 2,
+		"IncludeVendored=true must keep vendor changes in the filtered set")
+}
+
 // TestTreeDiff_filterChanges_prefixBlacklist verifies blacklist uses path prefix match only.
 func TestTreeDiff_filterChanges_prefixBlacklist(t *testing.T) {
 	t.Parallel()
diff --git a/internal/analyzers/plumbing/ticks.go b/internal/analyzers/plumbing/ticks.go
index 4769055..9115ab0 100644
--- a/internal/analyzers/plumbing/ticks.go
+++ b/internal/analyzers/plumbing/ticks.go
@@ -15,12 +15,15 @@ import (
 
 // TicksSinceStart computes relative time ticks for each commit since the start.
 type TicksSinceStart struct {
-	tick0        *time.Time
-	commits      map[int][]gitlib.Hash
-	remote       string
-	TickSize     time.Duration
-	previousTick int
-	Tick         int
+	tick0         *time.Time
+	commits       map[int][]gitlib.Hash
+	remote        string
+	TickSize      time.Duration
+	previousTick  int
+	Tick          int
+	lastValidWhen time.Time           // Most recent in-window committer timestamp; substitution source.
+	tick0Set      bool                // tick0 has been seeded by an in-window commit.
+	anomalies     *timeAnomalyTracker // Shared across Fork() clones so aggregated counts survive forking.
 }
 
 const (
@@ -90,6 +93,12 @@ func (t *TicksSinceStart) Initialize(_ *gitlib.Repository) error {
 	}
 
 	t.tick0 = &time.Time{}
+	t.tick0Set = false
+	t.lastValidWhen = time.Time{}
+
+	if t.anomalies == nil {
+		t.anomalies = &timeAnomalyTracker{}
+	}
 
 	t.previousTick = 0
 	if t.commits == nil || len(t.commits) > 0 {
@@ -104,14 +113,14 @@ func (t *TicksSinceStart) Initialize(_ *gitlib.Repository) error {
 // Consume processes a single commit with the provided dependency results.
 func (t *TicksSinceStart) Consume(_ context.Context, ac *analyze.Context) (analyze.TC, error) {
 	commit := ac.Commit
-	index := ac.Index
+	when := t.sanitizeWhen(commit.Committer().When)
 
-	if index == 0 {
-		tick0 := commit.Committer().When
-		*t.tick0 = FloorTime(tick0, t.TickSize)
+	if !t.tick0Set {
+		*t.tick0 = FloorTime(when, t.TickSize)
+		t.tick0Set = true
 	}
 
-	tick := max(int(commit.Committer().When.Sub(*t.tick0)/t.TickSize), t.previousTick)
+	tick := max(int(when.Sub(*t.tick0)/t.TickSize), t.previousTick)
 
 	t.previousTick = tick
 
@@ -142,6 +151,58 @@ func (t *TicksSinceStart) Consume(_ context.Context, ac *analyze.Context) (analy
 	return analyze.TC{}, nil
 }
 
+// sanitizeWhen clamps a committer timestamp into the sane analysis window
+// [minSaneCommitTime, time.Now()+maxClockSkew]. Out-of-window values are
+// substituted with the most recent in-window timestamp seen, falling back
+// to minSaneCommitTime on the first commit. Each substitution is counted
+// and surfaced via TimeAnomalies(); the warning log is rate-limited.
+//
+// In-window inputs pass through unchanged and update lastValidWhen so
+// future anomalies have a fresh substitution source.
+func (t *TicksSinceStart) sanitizeWhen(when time.Time) time.Time {
+	upperBound := time.Now().Add(maxClockSkew)
+
+	switch {
+	case when.Before(minSaneCommitTime):
+		replacement := t.substituteWhen()
+		t.anomalies.recordBeforeMin(when, replacement)
+
+		return replacement
+	case when.After(upperBound):
+		replacement := t.substituteWhen()
+		t.anomalies.recordAfterMax(when, replacement)
+
+		return replacement
+	}
+
+	t.lastValidWhen = when
+
+	return when
+}
+
+// substituteWhen picks a stand-in for an out-of-window committer time:
+// the most recent in-window value if we have one, otherwise the
+// minSaneCommitTime floor (so the bad commit collapses to tick 0 instead
+// of inflating the analysis period).
+func (t *TicksSinceStart) substituteWhen() time.Time {
+	if t.lastValidWhen.IsZero() {
+		return minSaneCommitTime
+	}
+
+	return t.lastValidWhen
+}
+
+// TimeAnomalies returns the cumulative count of committer-timestamp
+// anomalies clamped during this analyzer's run. See [TimeAnomalyStats]
+// for the operational meaning.
+func (t *TicksSinceStart) TimeAnomalies() TimeAnomalyStats {
+	if t.anomalies == nil {
+		return TimeAnomalyStats{}
+	}
+
+	return t.anomalies.snapshot()
+}
+
 // FloorTime rounds a timestamp down to the nearest tick boundary.
 func FloorTime(t time.Time, d time.Duration) time.Time {
 	result := t.Round(d)
diff --git a/internal/analyzers/plumbing/ticks_anomaly.go b/internal/analyzers/plumbing/ticks_anomaly.go
new file mode 100644
index 0000000..ca1e1be
--- /dev/null
+++ b/internal/analyzers/plumbing/ticks_anomaly.go
@@ -0,0 +1,123 @@
+package plumbing
+
+import (
+	"log"
+	"sync/atomic"
+	"time"
+)
+
+// minSaneCommitTime is the lower bound for a plausible committer timestamp.
+// Git itself first shipped in 2005; commits stamped before 1990-01-01 are
+// almost certainly the result of a corrupt commit object, an unset system
+// clock (epoch 0 → 1970), or a deliberate `GIT_COMMITTER_DATE=` override.
+//
+// Without this clamp a single such commit pegged tick0 to ~1970, after
+// which every modern commit's Sub(tick0) overflowed the int64-nanosecond
+// [time.Duration] and clamped to ~292 years. That clamp leaked into burndown
+// as a 106 740-day "analysis period". See ticks.go: the bug was sticky via
+// max(tick, previousTick).
+var minSaneCommitTime = time.Date(1990, time.January, 1, 0, 0, 0, 0, time.UTC)
+
+// maxClockSkew is the upper-bound grace allowed past wall-clock time. A
+// committer timestamp more than this far in the future is treated as
+// anomalous regardless of repo content.
+const maxClockSkew = 24 * time.Hour
+
+// anomalyLogIntervalNanos throttles the per-event "anomalous committer
+// timestamp" log line so a repo with thousands of bad commits doesn't
+// drown the operator-facing log. Same shape as
+// burndown/mismatch_tracker's log throttle.
+const anomalyLogIntervalNanos = int64(time.Second)
+
+// timeAnomalyTracker counts committer-timestamp anomalies detected during
+// tick computation and rate-limits the warning log. Atomics make the
+// tracker safe to call from the per-shard clones returned by Fork(); the
+// sequential plumbing analyzer never actually races, but using atomics
+// keeps Fork() safe by construction.
+type timeAnomalyTracker struct {
+	beforeMin    atomic.Int64 // Counter: timestamps before minSaneCommitTime.
+	afterMax     atomic.Int64 // Counter: timestamps too far in the future.
+	dropped      atomic.Int64 // Suppressed since last emitted log line.
+	lastLogNanos atomic.Int64 // Monotonic-ish slot timestamp.
+}
+
+// recordBeforeMin bumps the before-min counter and emits a rate-limited
+// warning. when is the bogus committer time we observed, replacement is
+// the time we substituted into tick math.
+func (t *timeAnomalyTracker) recordBeforeMin(when, replacement time.Time) {
+	t.beforeMin.Add(1)
+	t.maybeLog("before-min", when, replacement)
+}
+
+// recordAfterMax bumps the after-max counter and emits a rate-limited
+// warning. Mirrors recordBeforeMin for the future-clamp side.
+func (t *timeAnomalyTracker) recordAfterMax(when, replacement time.Time) {
+	t.afterMax.Add(1)
+	t.maybeLog("after-max", when, replacement)
+}
+
+// maybeLog emits at most one warning per anomalyLogIntervalNanos. Mirrors
+// burndown.mismatchTracker.maybeLog: try to claim the slot via CAS; on
+// failure (slot still warm), bump dropped and return silently. On success,
+// flush the dropped tail in the emitted line.
+func (t *timeAnomalyTracker) maybeLog(kind string, when, replacement time.Time) {
+	now := time.Now().UnixNano()
+	last := t.lastLogNanos.Load()
+
+	if now-last < anomalyLogIntervalNanos {
+		t.dropped.Add(1)
+
+		return
+	}
+
+	if !t.lastLogNanos.CompareAndSwap(last, now) {
+		t.dropped.Add(1)
+
+		return
+	}
+
+	dropped := t.dropped.Swap(0)
+	if dropped == 0 {
+		log.Printf("ticks: %s anomalous committer timestamp %s, substituted %s",
+			kind, when.Format(time.RFC3339), replacement.Format(time.RFC3339))
+
+		return
+	}
+
+	log.Printf("ticks: %s anomalous committer timestamp %s, substituted %s [dropped=%d since last]",
+		kind, when.Format(time.RFC3339), replacement.Format(time.RFC3339), dropped)
+}
+
+// snapshot returns the running counts. Used by the TimeAnomalies()
+// accessor for tests and external observers.
+func (t *timeAnomalyTracker) snapshot() TimeAnomalyStats {
+	return TimeAnomalyStats{
+		BeforeMin: t.beforeMin.Load(),
+		AfterMax:  t.afterMax.Load(),
+	}
+}
+
+// TimeAnomalyStats reports anomalous committer-timestamp detections.
+//
+// BeforeMin counts commits whose committer time was earlier than the
+// hard-coded floor (1990-01-01 UTC) — typically epoch-0 (1970) values
+// from corrupt commit objects, unset system clocks, or deliberate
+// GIT_COMMITTER_DATE overrides.
+//
+// AfterMax counts commits whose committer time was more than 24h past
+// the analyzer's wall-clock — typically forged future timestamps
+// ("--date=2099-01-01") or clock skew at commit time.
+//
+// In both cases the substituted time is the previous valid committer
+// timestamp (or 1990-01-01 UTC if no valid commit has been seen yet),
+// so the bad commit collapses onto the timeline at a sensible point
+// instead of overflowing the int64-nanosecond Duration in ticks.go.
+type TimeAnomalyStats struct {
+	BeforeMin int64
+	AfterMax  int64
+}
+
+// Total returns the combined count of anomalies on both bounds.
+func (s TimeAnomalyStats) Total() int64 {
+	return s.BeforeMin + s.AfterMax
+}
diff --git a/internal/analyzers/plumbing/ticks_anomaly_test.go b/internal/analyzers/plumbing/ticks_anomaly_test.go
new file mode 100644
index 0000000..c3b7472
--- /dev/null
+++ b/internal/analyzers/plumbing/ticks_anomaly_test.go
@@ -0,0 +1,271 @@
+package plumbing
+
+import (
+	"context"
+	"sync"
+	"testing"
+	"time"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
+)
+
+func newTicks(t *testing.T) *TicksSinceStart {
+	t.Helper()
+
+	ts := &TicksSinceStart{}
+
+	err := ts.Initialize(nil)
+	if err != nil {
+		t.Fatalf("Initialize: %v", err)
+	}
+
+	return ts
+}
+
+// makeCommit builds a minimal gitlib.TestCommit suitable for driving
+// TicksSinceStart.Consume — only Hash, Committer, and NumParents are read
+// by the tick path.
+func makeCommit(when time.Time, hashByte byte) *gitlib.TestCommit {
+	parent := gitlib.Hash{}
+	parent[0] = hashByte // Any non-zero parent makes NumParents() > 0.
+	commit := gitlib.NewTestCommit(
+		gitlib.Hash{hashByte},
+		gitlib.Signature{Name: "T", Email: "t@t", When: when},
+		"msg",
+		parent,
+	)
+
+	return commit
+}
+
+func consume(t *testing.T, ts *TicksSinceStart, when time.Time, index int) int {
+	t.Helper()
+
+	commit := makeCommit(when, byte(index+1))
+
+	_, err := ts.Consume(context.Background(), &analyze.Context{
+		Commit: commit,
+		Index:  index,
+	})
+	if err != nil {
+		t.Fatalf("Consume: %v", err)
+	}
+
+	return ts.Tick
+}
+
+func TestSanitizeWhen_BeforeMin_FirstCommit_FallsBackToMinSaneTime(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	// First commit at unix epoch (1970) — the canonical "epoch-zero
+	// committer" failure mode that previously pegged tick0 to 1970 and
+	// produced a 106 740-day analysis period.
+	got := consume(t, ts, time.Unix(0, 0), 0)
+	if got != 0 {
+		t.Errorf("first-commit tick = %d, want 0 (anomaly must collapse to start)", got)
+	}
+
+	if stats := ts.TimeAnomalies(); stats.BeforeMin != 1 {
+		t.Errorf("BeforeMin = %d, want 1", stats.BeforeMin)
+	}
+
+	if !ts.tick0Set {
+		t.Error("tick0Set must be true after first consume even on anomaly")
+	}
+
+	if !ts.tick0.Equal(FloorTime(minSaneCommitTime, ts.TickSize)) {
+		t.Errorf("tick0 = %s, want floor(%s) (must seed from sanitized substitute)",
+			ts.tick0.Format(time.RFC3339), minSaneCommitTime.Format(time.RFC3339))
+	}
+}
+
+func TestSanitizeWhen_BeforeMin_AfterValidCommit_UsesLastValid(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	// Seed with a normal commit to populate lastValidWhen.
+	good := time.Date(2024, time.April, 1, 0, 0, 0, 0, time.UTC)
+
+	tick0 := consume(t, ts, good, 0)
+	if tick0 != 0 {
+		t.Fatalf("seed tick = %d, want 0", tick0)
+	}
+
+	// Then a bogus epoch-0 commit. Its tick must equal the previous tick
+	// (no time travel) — substitution = lastValidWhen.
+	tick1 := consume(t, ts, time.Unix(0, 0), 1)
+	if tick1 != 0 {
+		t.Errorf("anomalous tick after valid = %d, want 0 (must reuse lastValidWhen)", tick1)
+	}
+
+	// And a normal commit one day later still ticks forward as expected.
+	tick2 := consume(t, ts, good.Add(24*time.Hour), 2)
+	if tick2 != 1 {
+		t.Errorf("post-anomaly tick = %d, want 1 (anomaly must not poison the timeline)", tick2)
+	}
+}
+
+func TestSanitizeWhen_AfterMax_ForgedFutureCommit_DoesNotPoisonTimeline(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	good := time.Date(2024, time.April, 1, 0, 0, 0, 0, time.UTC)
+	consume(t, ts, good, 0)
+
+	// `git commit --date=2099-01-01` style — far past now+24h.
+	forged := time.Date(2099, time.January, 1, 0, 0, 0, 0, time.UTC)
+	tickForged := consume(t, ts, forged, 1)
+
+	// Without the fix the forged tick would explode (and stick via
+	// max(tick, previousTick)). With the fix it collapses to the
+	// previous valid tick.
+	if tickForged != 0 {
+		t.Errorf("forged-future tick = %d, want 0 (must clamp to lastValidWhen)", tickForged)
+	}
+
+	// Subsequent valid commit ticks forward by exactly 1 day.
+	next := consume(t, ts, good.Add(24*time.Hour), 2)
+	if next != 1 {
+		t.Errorf("post-forged tick = %d, want 1 (forgery must not stick via previousTick)", next)
+	}
+
+	if stats := ts.TimeAnomalies(); stats.AfterMax != 1 {
+		t.Errorf("AfterMax = %d, want 1", stats.AfterMax)
+	}
+}
+
+func TestSanitizeWhen_NormalRange_UnchangedAndUpdatesLastValid(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	when := time.Date(2024, time.April, 1, 12, 0, 0, 0, time.UTC)
+	got := ts.sanitizeWhen(when)
+
+	if !got.Equal(when) {
+		t.Errorf("in-window time was modified: got %s, want %s", got, when)
+	}
+
+	if !ts.lastValidWhen.Equal(when) {
+		t.Errorf("lastValidWhen = %s, want %s (must update on valid input)", ts.lastValidWhen, when)
+	}
+
+	if stats := ts.TimeAnomalies(); stats.Total() != 0 {
+		t.Errorf("anomalies total = %d, want 0 for in-window input", stats.Total())
+	}
+}
+
+func TestSanitizeWhen_ClockSkewWithinGrace_PassesThrough(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	// 1 hour into the future is within maxClockSkew (24h) — should pass.
+	when := time.Now().Add(1 * time.Hour)
+	got := ts.sanitizeWhen(when)
+
+	if !got.Equal(when) {
+		t.Errorf("within-grace future time was rejected: got %s, want %s", got, when)
+	}
+
+	if stats := ts.TimeAnomalies(); stats.AfterMax != 0 {
+		t.Errorf("AfterMax = %d, want 0 (grace window must allow small clock skew)", stats.AfterMax)
+	}
+}
+
+func TestTimeAnomalyTracker_RateLimit_DropsBurstWithinInterval(t *testing.T) {
+	t.Parallel()
+
+	var tr timeAnomalyTracker
+
+	when := time.Unix(0, 0)
+	repl := minSaneCommitTime
+
+	for range 1000 {
+		tr.recordBeforeMin(when, repl)
+	}
+
+	if got := tr.dropped.Load(); got != 999 {
+		t.Errorf("dropped = %d, want 999 (1000 events, 1 logged, 999 suppressed)", got)
+	}
+
+	if got := tr.snapshot().BeforeMin; got != 1000 {
+		t.Errorf("BeforeMin = %d, want 1000 (counter must record every event)", got)
+	}
+}
+
+func TestTimeAnomalyTracker_ConcurrentRecord_NoLostUpdates(t *testing.T) {
+	t.Parallel()
+
+	var (
+		tr        timeAnomalyTracker
+		wg        sync.WaitGroup
+		perWorker = int64(500)
+		workers   = 8
+	)
+
+	when := time.Unix(0, 0)
+	repl := minSaneCommitTime
+
+	wg.Add(workers)
+
+	for range workers {
+		go func() {
+			defer wg.Done()
+
+			for range int(perWorker) {
+				tr.recordBeforeMin(when, repl)
+			}
+		}()
+	}
+
+	wg.Wait()
+
+	want := perWorker * int64(workers)
+	if got := tr.snapshot().BeforeMin; got != want {
+		t.Errorf("BeforeMin = %d, want %d (concurrent atomic updates must not lose any)", got, want)
+	}
+}
+
+func TestTimeAnomalyStats_Total_SumsBothBounds(t *testing.T) {
+	t.Parallel()
+
+	s := TimeAnomalyStats{BeforeMin: 4, AfterMax: 7}
+	if got := s.Total(); got != 11 {
+		t.Errorf("Total = %d, want 11", got)
+	}
+}
+
+// TestRegressionAnalysisPeriodOverflow_NoLongerProduces292Years reproduces
+// the bug shape: a single epoch-0 commit followed by normal commits used to
+// produce a tick range of ~106 751 days (the [time.Duration] int64 overflow
+// clamp). With sanitization in place the tick range is bounded by real commit deltas.
+func TestRegressionAnalysisPeriodOverflow_NoLongerProduces292Years(t *testing.T) {
+	t.Parallel()
+
+	ts := newTicks(t)
+
+	// First commit: epoch-0 (the trigger).
+	consume(t, ts, time.Unix(0, 0), 0)
+
+	// Then 5 commits one day apart in 2024.
+	base := time.Date(2024, time.April, 1, 0, 0, 0, 0, time.UTC)
+
+	for i := range 5 {
+		got := consume(t, ts, base.Add(time.Duration(i)*24*time.Hour), i+1)
+		// Ticks are measured from minSaneCommitTime (1990-01-01). So
+		// each 2024-04-0X commit lands ~12 510..12 514 days in. The
+		// important property: ticks are NOT clamped to ~106 751.
+		const overflowSentinel = 100_000
+
+		if got > overflowSentinel {
+			t.Errorf("tick %d for normal commit i=%d — overflow clamp regressed",
+				got, i)
+		}
+	}
+}
diff --git a/internal/analyzers/plumbing/tree_diff.go b/internal/analyzers/plumbing/tree_diff.go
index b0d0d82..17c18bc 100644
--- a/internal/analyzers/plumbing/tree_diff.go
+++ b/internal/analyzers/plumbing/tree_diff.go
@@ -13,6 +13,8 @@ import (
 	"github.com/src-d/enry/v2"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/langpath"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing/pathpolicy"
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 	"github.com/Sumatoshi-tech/codefang/pkg/pathfilter"
 	"github.com/Sumatoshi-tech/codefang/pkg/pipeline"
@@ -28,6 +30,16 @@ type TreeDiffAnalyzer struct {
 	pathFilter     *pathfilter.Filter
 	Changes        gitlib.Changes
 	previousCommit gitlib.Hash
+	// Pathspec holds pre-computed libgit2 pathspec globs derived from the
+	// configured --languages set via langpath.Globs. Empty when no language
+	// restriction applies.
+	Pathspec []string
+
+	// PathPolicy carries vendor / generated / extra-prefix exclusion
+	// rules shared with the static phase. The zero value excludes
+	// enry.IsVendor and pathfilter-detected generated files by
+	// default.
+	PathPolicy pathpolicy.Options
 }
 
 const (
@@ -39,7 +51,10 @@ const (
 	ConfigTreeDiffLanguages = "TreeDiff.LanguagesDetection"
 	// ConfigTreeDiffFilterRegexp is the configuration key for the file path filter regular expression.
 	ConfigTreeDiffFilterRegexp = "TreeDiff.FilteredRegexes"
-	allLanguages               = "all"
+	// ConfigTreeDiffPathPolicy is the fact key for the cross-phase vendor /
+	// generated / extra-prefix exclusion policy populated by the CLI.
+	ConfigTreeDiffPathPolicy = "TreeDiff.PathPolicy"
+	allLanguages             = "all"
 )
 
 // ErrInvalidSkipFiles indicates a type assertion failure for SkipFiles configuration.
@@ -111,6 +126,50 @@ func (t *TreeDiffAnalyzer) ListConfigurationOptions() []pipeline.ConfigurationOp
 	}
 }
 
+// applyLanguageConfig normalises the user-supplied language tokens into
+// the canonical Languages set and, when the set restricts by language,
+// pre-computes the libgit2 pathspec globs via langpath.Globs.
+//
+// Aliases (e.g. "golang" → "Go", "js" → "JavaScript") are resolved via
+// enry so that the Go-side filter keys match the lowercased canonical
+// name that enry.GetLanguage reports for detected files.
+func (t *TreeDiffAnalyzer) applyLanguageConfig(val []string) error {
+	t.Languages = map[string]bool{}
+
+	for _, lang := range val {
+		token := strings.TrimSpace(lang)
+		if strings.EqualFold(token, allLanguages) {
+			t.Languages[allLanguages] = true
+
+			continue
+		}
+
+		canonical, ok := enry.GetLanguageByAlias(token)
+		if !ok {
+			// langpath.Globs below will reject the same token with a
+			// richer error; record it as-is so the caller sees that error.
+			t.Languages[strings.ToLower(token)] = true
+
+			continue
+		}
+
+		t.Languages[strings.ToLower(canonical)] = true
+	}
+
+	globs, wantsAll, err := langpath.Globs(val)
+	if err != nil {
+		return fmt.Errorf("tree-diff pathspec: %w", err)
+	}
+
+	if wantsAll {
+		t.Pathspec = nil
+	} else {
+		t.Pathspec = globs
+	}
+
+	return nil
+}
+
 // Configure sets up the analyzer with the provided facts.
 func (t *TreeDiffAnalyzer) Configure(facts map[string]any) error {
 	if val, exists := facts[ConfigTreeDiffEnableBlacklist].(bool); exists && val {
@@ -123,10 +182,14 @@ func (t *TreeDiffAnalyzer) Configure(facts map[string]any) error {
 		t.pathFilter = pathfilter.New()
 	}
 
+	if val, exists := facts[ConfigTreeDiffPathPolicy].(pathpolicy.Options); exists {
+		t.PathPolicy = val
+	}
+
 	if val, exists := facts[ConfigTreeDiffLanguages].([]string); exists {
-		t.Languages = map[string]bool{}
-		for _, lang := range val {
-			t.Languages[strings.ToLower(strings.TrimSpace(lang))] = true
+		err := t.applyLanguageConfig(val)
+		if err != nil {
+			return err
 		}
 	} else if t.Languages == nil {
 		t.Languages = map[string]bool{}
@@ -237,6 +300,11 @@ func (t *TreeDiffAnalyzer) filterChanges(ctx context.Context, changes gitlib.Cha
 func (t *TreeDiffAnalyzer) shouldIncludeChange(ctx context.Context, change *gitlib.Change) bool {
 	name, hash := changeNameHash(change)
 
+	// Shared vendor / generated / extra-prefix exclusion policy.
+	if pathpolicy.Exclude(name, nil, t.PathPolicy) {
+		return false
+	}
+
 	// Check blacklist: user-specified prefixes + vendor/generated detection.
 	if len(t.SkipFiles) > 0 && t.isBlacklisted(name) {
 		return false
diff --git a/internal/analyzers/quality/analyzer.go b/internal/analyzers/quality/analyzer.go
index 6f9a933..d7fbd02 100644
--- a/internal/analyzers/quality/analyzer.go
+++ b/internal/analyzers/quality/analyzer.go
@@ -6,6 +6,7 @@ package quality
 import (
 	"context"
 	"maps"
+	"time"
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/cohesion"
@@ -75,6 +76,8 @@ type TickData struct {
 // tickAccumulator holds per-commit quality during aggregation.
 type tickAccumulator struct {
 	commitQuality map[string]*TickQuality
+	startTime     time.Time
+	endTime       time.Time
 }
 
 // qualityAvgTCSize is the estimated bytes of TC payload per commit (quality metrics).
@@ -298,10 +301,22 @@ func extractTC(tc analyze.TC, byTick map[int]*tickAccumulator) error {
 	if !ok {
 		acc = &tickAccumulator{
 			commitQuality: make(map[string]*TickQuality),
+			startTime:     tc.Timestamp,
+			endTime:       tc.Timestamp,
 		}
 		byTick[tc.Tick] = acc
 	}
 
+	if !tc.Timestamp.IsZero() {
+		if tc.Timestamp.Before(acc.startTime) || acc.startTime.IsZero() {
+			acc.startTime = tc.Timestamp
+		}
+
+		if tc.Timestamp.After(acc.endTime) {
+			acc.endTime = tc.Timestamp
+		}
+	}
+
 	acc.commitQuality[tc.CommitHash.String()] = tq
 
 	return nil
@@ -361,7 +376,9 @@ func buildTick(tick int, state *tickAccumulator) (analyze.TICK, error) {
 	}
 
 	return analyze.TICK{
-		Tick: tick,
+		Tick:      tick,
+		StartTime: state.startTime,
+		EndTime:   state.endTime,
 		Data: &TickData{
 			CommitQuality: state.commitQuality,
 		},
@@ -386,6 +403,7 @@ func ticksToReport(_ context.Context, ticks []analyze.TICK, commitsByTick map[in
 	return analyze.Report{
 		"commit_quality":  commitQuality,
 		"commits_by_tick": ct,
+		"tick_bounds":     analyze.BuildTickBounds(ticks),
 	}
 }
 
diff --git a/internal/analyzers/quality/metrics.go b/internal/analyzers/quality/metrics.go
index ee1282b..8b03b5c 100644
--- a/internal/analyzers/quality/metrics.go
+++ b/internal/analyzers/quality/metrics.go
@@ -182,8 +182,10 @@ func computeTickStats(tq *TickQuality) TickStats {
 
 // TimeSeriesEntry holds per-tick quality data for the time series output.
 type TimeSeriesEntry struct {
-	Tick  int       `json:"tick"  yaml:"tick"`
-	Stats TickStats `json:"stats" yaml:"stats"`
+	Tick      int       `json:"tick"                 yaml:"tick"`
+	StartTime string    `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime   string    `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	Stats     TickStats `json:"stats"                yaml:"stats"`
 }
 
 // AggregateData contains overall summary statistics.
@@ -205,6 +207,7 @@ type AggregateData struct {
 // ReportData is the parsed input data for quality metrics computation.
 type ReportData struct {
 	TickQuality map[int]*TickQuality
+	TickBounds  map[int]analyze.TickBounds
 }
 
 // ParseReportData extracts ReportData from an analyzer report.
@@ -223,6 +226,10 @@ func ParseReportData(report analyze.Report) (*ReportData, error) {
 		data.TickQuality = make(map[int]*TickQuality)
 	}
 
+	if v, ok := report["tick_bounds"].(map[int]analyze.TickBounds); ok {
+		data.TickBounds = v
+	}
+
 	return data, nil
 }
 
@@ -260,7 +267,15 @@ func ComputeAllMetrics(report analyze.Report) (*ComputedMetrics, error) {
 
 	for i, tick := range ticks {
 		ts := computeTickStats(input.TickQuality[tick])
-		timeSeries[i] = TimeSeriesEntry{Tick: tick, Stats: ts}
+
+		entry := TimeSeriesEntry{Tick: tick, Stats: ts}
+
+		if bounds, hasBounds := input.TickBounds[tick]; hasBounds {
+			entry.StartTime = bounds.FormatStartTime()
+			entry.EndTime = bounds.FormatEndTime()
+		}
+
+		timeSeries[i] = entry
 
 		complexityMedians[i] = ts.ComplexityMedian
 		complexityP95s[i] = ts.ComplexityP95
@@ -288,25 +303,31 @@ func ComputeAllMetrics(report analyze.Report) (*ComputedMetrics, error) {
 		globalMinCohesion = 0
 	}
 
-	complexityMedianMean := stats.Mean(complexityMedians)
-	complexityP95Mean := stats.Mean(complexityP95s)
-	halsteadMedianMean := stats.Mean(halsteadMedians)
-	commentMeanMean := stats.Mean(commentMeans)
-	cohesionMeanMean := stats.Mean(cohesionMeans)
-
 	return &ComputedMetrics{
 		TimeSeries: timeSeries,
-		Aggregate: AggregateData{
-			TotalTicks:            len(ticks),
-			TotalFilesAnalyzed:    totalFiles,
-			ComplexityMedianMean:  complexityMedianMean,
-			ComplexityP95Mean:     complexityP95Mean,
-			HalsteadVolMedianMean: halsteadMedianMean,
-			TotalDeliveredBugs:    totalBugs,
-			CommentScoreMeanMean:  commentMeanMean,
-			MinCommentScore:       globalMinComment,
-			CohesionMeanMean:      cohesionMeanMean,
-			MinCohesion:           globalMinCohesion,
-		},
+		Aggregate: computeAggregate(
+			len(ticks), totalFiles, totalBugs,
+			globalMinComment, globalMinCohesion,
+			complexityMedians, complexityP95s, halsteadMedians, commentMeans, cohesionMeans,
+		),
 	}, nil
 }
+
+func computeAggregate(
+	totalTicks, totalFiles int,
+	totalBugs, minComment, minCohesion float64,
+	complexityMedians, complexityP95s, halsteadMedians, commentMeans, cohesionMeans []float64,
+) AggregateData {
+	return AggregateData{
+		TotalTicks:            totalTicks,
+		TotalFilesAnalyzed:    totalFiles,
+		ComplexityMedianMean:  stats.Mean(complexityMedians),
+		ComplexityP95Mean:     stats.Mean(complexityP95s),
+		HalsteadVolMedianMean: stats.Mean(halsteadMedians),
+		TotalDeliveredBugs:    totalBugs,
+		CommentScoreMeanMean:  stats.Mean(commentMeans),
+		MinCommentScore:       minComment,
+		CohesionMeanMean:      stats.Mean(cohesionMeans),
+		MinCohesion:           minCohesion,
+	}
+}
diff --git a/internal/analyzers/quality/store_writer_test.go b/internal/analyzers/quality/store_writer_test.go
index 6982977..6b0ee26 100644
--- a/internal/analyzers/quality/store_writer_test.go
+++ b/internal/analyzers/quality/store_writer_test.go
@@ -1,7 +1,5 @@
 package quality
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/sentiment/analyzer.go b/internal/analyzers/sentiment/analyzer.go
index 2741aea..8d1b6ae 100644
--- a/internal/analyzers/sentiment/analyzer.go
+++ b/internal/analyzers/sentiment/analyzer.go
@@ -662,6 +662,7 @@ func ticksToReport(_ context.Context, ticks []analyze.TICK, commitsByTick map[in
 	return analyze.Report{
 		"comments_by_commit": commentsByCommit,
 		"commits_by_tick":    ct,
+		"tick_bounds":        analyze.BuildTickBounds(ticks),
 	}
 }
 
diff --git a/internal/analyzers/sentiment/metrics.go b/internal/analyzers/sentiment/metrics.go
index f3b0b01..9101ef6 100644
--- a/internal/analyzers/sentiment/metrics.go
+++ b/internal/analyzers/sentiment/metrics.go
@@ -73,6 +73,7 @@ type ReportData struct {
 	EmotionsByTick map[int]float32
 	CommentsByTick map[int][]string
 	CommitsByTick  map[int][]gitlib.Hash
+	TickBounds     map[int]analyze.TickBounds
 }
 
 // ParseReportData extracts ReportData from an analyzer report.
@@ -92,6 +93,10 @@ func ParseReportData(report analyze.Report) (*ReportData, error) {
 		)
 	}
 
+	if v, ok := report["tick_bounds"].(map[int]analyze.TickBounds); ok {
+		data.TickBounds = v
+	}
+
 	if data.EmotionsByTick == nil {
 		data.EmotionsByTick = make(map[int]float32)
 	}
@@ -107,11 +112,13 @@ func ParseReportData(report analyze.Report) (*ReportData, error) {
 
 // TimeSeriesData contains sentiment data for a time period.
 type TimeSeriesData struct {
-	Tick           int     `json:"tick"           yaml:"tick"`
-	Sentiment      float32 `json:"sentiment"      yaml:"sentiment"`
-	CommentCount   int     `json:"comment_count"  yaml:"comment_count"`
-	CommitCount    int     `json:"commit_count"   yaml:"commit_count"`
-	Classification string  `json:"classification" yaml:"classification"`
+	Tick           int     `json:"tick"                 yaml:"tick"`
+	StartTime      string  `json:"start_time,omitempty" yaml:"start_time,omitempty"`
+	EndTime        string  `json:"end_time,omitempty"   yaml:"end_time,omitempty"`
+	Sentiment      float32 `json:"sentiment"            yaml:"sentiment"`
+	CommentCount   int     `json:"comment_count"        yaml:"comment_count"`
+	CommitCount    int     `json:"commit_count"         yaml:"commit_count"`
+	Classification string  `json:"classification"       yaml:"classification"`
 }
 
 // TrendData contains trend information.
@@ -260,13 +267,20 @@ func computeTimeSeriesWithOpts(input *ReportData, opts MetricOptions) []TimeSeri
 
 		classification := classifySentimentWithOpts(sentiment, opts)
 
-		result = append(result, TimeSeriesData{
+		entry := TimeSeriesData{
 			Tick:           tick,
 			Sentiment:      sentiment,
 			CommentCount:   commentCount,
 			CommitCount:    commitCount,
 			Classification: classification,
-		})
+		}
+
+		if bounds, ok := input.TickBounds[tick]; ok {
+			entry.StartTime = bounds.FormatStartTime()
+			entry.EndTime = bounds.FormatEndTime()
+		}
+
+		result = append(result, entry)
 	}
 
 	return result
diff --git a/internal/analyzers/sentiment/metrics_test.go b/internal/analyzers/sentiment/metrics_test.go
index 2b7c7dc..07bb67d 100644
--- a/internal/analyzers/sentiment/metrics_test.go
+++ b/internal/analyzers/sentiment/metrics_test.go
@@ -2,6 +2,7 @@ package sentiment
 
 import (
 	"testing"
+	"time"
 
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
@@ -27,6 +28,11 @@ const (
 	testHashB = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
 )
 
+var (
+	testTickTime1 = time.Date(2024, 1, 15, 10, 0, 0, 0, time.UTC)
+	testTickTime2 = time.Date(2024, 1, 16, 12, 0, 0, 0, time.UTC)
+)
+
 // Helper function to create test hash.
 func testHash(s string) gitlib.Hash {
 	var h gitlib.Hash
@@ -182,6 +188,40 @@ func TestSentimentTimeSeriesMetric_MissingCommmentsAndCommits(t *testing.T) {
 	assert.Equal(t, 0, result[0].CommitCount)
 }
 
+func TestSentimentTimeSeriesMetric_TickTimestamps(t *testing.T) {
+	t.Parallel()
+
+	t1 := testTickTime1
+	t2 := testTickTime2
+
+	input := &ReportData{
+		EmotionsByTick: map[int]float32{0: testSentimentPositive},
+		TickBounds: map[int]analyze.TickBounds{
+			0: {StartTime: t1, EndTime: t2},
+		},
+	}
+
+	result := computeTimeSeriesWithOpts(input, DefaultMetricOptions())
+
+	require.Len(t, result, 1)
+	assert.Equal(t, "2024-01-15T10:00:00Z", result[0].StartTime)
+	assert.Equal(t, "2024-01-16T12:00:00Z", result[0].EndTime)
+}
+
+func TestSentimentTimeSeriesMetric_NoTickBounds(t *testing.T) {
+	t.Parallel()
+
+	input := &ReportData{
+		EmotionsByTick: map[int]float32{0: testSentimentPositive},
+	}
+
+	result := computeTimeSeriesWithOpts(input, DefaultMetricOptions())
+
+	require.Len(t, result, 1)
+	assert.Empty(t, result[0].StartTime)
+	assert.Empty(t, result[0].EndTime)
+}
+
 // --- SentimentTrendMetric Tests ---.
 
 func TestSentimentTrendMetric_Empty(t *testing.T) {
diff --git a/internal/analyzers/sentiment/store_writer_test.go b/internal/analyzers/sentiment/store_writer_test.go
index 96cff33..873927d 100644
--- a/internal/analyzers/sentiment/store_writer_test.go
+++ b/internal/analyzers/sentiment/store_writer_test.go
@@ -1,7 +1,5 @@
 package sentiment
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/shotness/store_writer_test.go b/internal/analyzers/shotness/store_writer_test.go
index 486572b..7f656b2 100644
--- a/internal/analyzers/shotness/store_writer_test.go
+++ b/internal/analyzers/shotness/store_writer_test.go
@@ -1,7 +1,5 @@
 package shotness
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"testing"
diff --git a/internal/analyzers/typos/store_writer_test.go b/internal/analyzers/typos/store_writer_test.go
index b8020ec..7c71367 100644
--- a/internal/analyzers/typos/store_writer_test.go
+++ b/internal/analyzers/typos/store_writer_test.go
@@ -1,7 +1,5 @@
 package typos
 
-// FRD: specs/frds/FRD-20260301-all-analyzers-store-based.md.
-
 import (
 	"context"
 	"sort"
diff --git a/internal/budget/solver_test.go b/internal/budget/solver_test.go
index 35bd905..17c105b 100644
--- a/internal/budget/solver_test.go
+++ b/internal/budget/solver_test.go
@@ -203,8 +203,6 @@ func TestDeriveKnobs_HugeWorkerAllocation(t *testing.T) {
 	assert.LessOrEqual(t, cfg.Workers, runtime.NumCPU(), "workers capped at CPU count")
 }
 
-// FRD: specs/frds/FRD-20260310-allocate-proportionally.md.
-
 func TestAllocateProportionally_SingleWeight(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/budget/static_solver.go b/internal/budget/static_solver.go
index 6fcfa43..287c1ea 100644
--- a/internal/budget/static_solver.go
+++ b/internal/budget/static_solver.go
@@ -6,8 +6,6 @@ import (
 	"github.com/Sumatoshi-tech/codefang/pkg/units"
 )
 
-// FRD: specs/frds/FRD-20260312-static-budget-tuning.md.
-
 // Static analysis cost model constants (empirically measured).
 const (
 	// StaticBaseOverhead is the fixed Go runtime + loaded analyzers overhead.
diff --git a/internal/budget/static_solver_test.go b/internal/budget/static_solver_test.go
index 7bb1733..15969f9 100644
--- a/internal/budget/static_solver_test.go
+++ b/internal/budget/static_solver_test.go
@@ -1,7 +1,5 @@
 package budget
 
-// FRD: specs/frds/FRD-20260312-static-budget-tuning.md.
-
 import (
 	"runtime"
 	"testing"
diff --git a/internal/cache/incremental.go b/internal/cache/incremental.go
new file mode 100644
index 0000000..1d2dc0d
--- /dev/null
+++ b/internal/cache/incremental.go
@@ -0,0 +1,91 @@
+package cache
+
+import (
+	"crypto/sha256"
+	"encoding/hex"
+	"encoding/json"
+	"errors"
+	"fmt"
+	"io"
+	"os"
+	"path/filepath"
+	"time"
+
+	"github.com/Sumatoshi-tech/codefang/internal/storage"
+	"github.com/Sumatoshi-tech/codefang/pkg/textutil"
+)
+
+// metaFilename is the name of the cache metadata file.
+const metaFilename = "cache.json"
+
+// metaFilePerm is the file permission for cache metadata.
+const metaFilePerm = 0o640
+
+// cacheKeySeparator separates root SHA and branch in the cache key input.
+const cacheKeySeparator = ":"
+
+// ErrCacheNotFound is returned when the cache metadata file does not exist.
+var ErrCacheNotFound = errors.New("cache metadata not found")
+
+// ErrCacheCorrupt is returned when the cache metadata file cannot be parsed.
+var ErrCacheCorrupt = errors.New("cache metadata corrupt")
+
+// IncrementalMeta holds metadata for an incremental analysis cache.
+type IncrementalMeta struct {
+	Version     int       `json:"version"`
+	HeadSHA     string    `json:"head_sha"`
+	Branch      string    `json:"branch"`
+	RootSHA     string    `json:"root_sha"`
+	CommitCount int       `json:"commit_count"`
+	AnalyzerIDs []string  `json:"analyzer_ids"`
+	Timestamp   time.Time `json:"timestamp"`
+}
+
+// Key produces a deterministic directory name from root SHA and branch.
+// The key is a SHA-256 hash of "rootSHA:branch", hex-encoded.
+func Key(rootSHA, branch string) string {
+	h := sha256.New()
+	h.Write([]byte(rootSHA + cacheKeySeparator + branch))
+
+	return hex.EncodeToString(h.Sum(nil))
+}
+
+// IsStale returns true when the cached root SHA does not match the current root SHA,
+// indicating a force-push or history rewrite.
+func IsStale(meta IncrementalMeta, currentRootSHA string) bool {
+	return meta.RootSHA != currentRootSHA
+}
+
+// WriteMeta atomically writes cache metadata as indented JSON to dir/cache.json.
+func WriteMeta(dir string, meta IncrementalMeta) error {
+	metaPath := filepath.Join(dir, metaFilename)
+
+	return storage.WriteAtomic(metaPath, metaFilePerm, func(w io.Writer) error {
+		return textutil.WriteJSON(w, meta, true)
+	})
+}
+
+// ReadMeta reads and parses cache metadata from dir/cache.json.
+// Returns ErrCacheNotFound if the file does not exist.
+// Returns ErrCacheCorrupt if the file cannot be parsed.
+func ReadMeta(dir string) (IncrementalMeta, error) {
+	metaPath := filepath.Join(dir, metaFilename)
+
+	data, err := os.ReadFile(metaPath)
+	if err != nil {
+		if errors.Is(err, os.ErrNotExist) {
+			return IncrementalMeta{}, ErrCacheNotFound
+		}
+
+		return IncrementalMeta{}, fmt.Errorf("read cache meta: %w", err)
+	}
+
+	var meta IncrementalMeta
+
+	unmarshalErr := json.Unmarshal(data, &meta)
+	if unmarshalErr != nil {
+		return IncrementalMeta{}, fmt.Errorf("%w: %w", ErrCacheCorrupt, unmarshalErr)
+	}
+
+	return meta, nil
+}
diff --git a/internal/cache/incremental_test.go b/internal/cache/incremental_test.go
new file mode 100644
index 0000000..49eedd7
--- /dev/null
+++ b/internal/cache/incremental_test.go
@@ -0,0 +1,100 @@
+package cache
+
+import (
+	"os"
+	"path/filepath"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestCacheKey_Deterministic(t *testing.T) {
+	t.Parallel()
+
+	key1 := Key("abc123", "main")
+	key2 := Key("abc123", "main")
+	assert.Equal(t, key1, key2, "same inputs must produce same key")
+	assert.NotEmpty(t, key1)
+}
+
+func TestCacheKey_DifferentBranch(t *testing.T) {
+	t.Parallel()
+
+	key1 := Key("abc123", "main")
+	key2 := Key("abc123", "feature/x")
+	assert.NotEqual(t, key1, key2, "different branches must produce different keys")
+}
+
+func TestCacheKey_DifferentRoot(t *testing.T) {
+	t.Parallel()
+
+	key1 := Key("abc123", "main")
+	key2 := Key("def456", "main")
+	assert.NotEqual(t, key1, key2, "different root SHAs must produce different keys")
+}
+
+func testMeta() IncrementalMeta {
+	return IncrementalMeta{
+		Version:     1,
+		HeadSHA:     "abc123def456",
+		Branch:      "main",
+		RootSHA:     "root789",
+		CommitCount: 1000,
+		AnalyzerIDs: []string{"burndown", "couples"},
+		Timestamp:   time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC),
+	}
+}
+
+func TestWriteReadMeta_RoundTrip(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	original := testMeta()
+
+	require.NoError(t, WriteMeta(dir, original))
+
+	got, err := ReadMeta(dir)
+	require.NoError(t, err)
+
+	assert.Equal(t, original.Version, got.Version)
+	assert.Equal(t, original.HeadSHA, got.HeadSHA)
+	assert.Equal(t, original.Branch, got.Branch)
+	assert.Equal(t, original.RootSHA, got.RootSHA)
+	assert.Equal(t, original.CommitCount, got.CommitCount)
+	assert.Equal(t, original.AnalyzerIDs, got.AnalyzerIDs)
+	assert.True(t, original.Timestamp.Equal(got.Timestamp))
+}
+
+func TestReadMeta_MissingFile(t *testing.T) {
+	t.Parallel()
+
+	_, err := ReadMeta(t.TempDir())
+	assert.ErrorIs(t, err, ErrCacheNotFound)
+}
+
+func TestReadMeta_CorruptFile(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "cache.json"), []byte("{not valid json"), 0o600))
+
+	_, err := ReadMeta(dir)
+	assert.ErrorIs(t, err, ErrCacheCorrupt)
+}
+
+func TestIsStale_MatchingRootSHA(t *testing.T) {
+	t.Parallel()
+
+	meta := testMeta()
+	assert.False(t, IsStale(meta, meta.RootSHA))
+}
+
+func TestIsStale_MismatchingRootSHA(t *testing.T) {
+	t.Parallel()
+
+	meta := testMeta()
+	assert.True(t, IsStale(meta, "different_root"))
+}
diff --git a/internal/config/apply_test.go b/internal/config/apply_test.go
index d5304cf..e4e3916 100644
--- a/internal/config/apply_test.go
+++ b/internal/config/apply_test.go
@@ -1,4 +1,3 @@
-// FRD: specs/frds/FRD-20260302-config-loader-facts.md.
 package config_test
 
 import (
diff --git a/internal/framework/blob_pipeline.go b/internal/framework/blob_pipeline.go
index 84170a8..d410061 100644
--- a/internal/framework/blob_pipeline.go
+++ b/internal/framework/blob_pipeline.go
@@ -56,6 +56,11 @@ type BlobPipeline struct {
 	// MaxChanges caps the number of file changes per commit. Zero = use default.
 	MaxChanges int
 
+	// TreeDiffPathspec is the libgit2 pathspec pre-filter for tree diffs.
+	// An empty or nil slice disables the filter (libgit2 returns all deltas).
+	// Derived from the configured --languages set by TreeDiffAnalyzer.
+	TreeDiffPathspec []string
+
 	// Metrics provides per-stage counters for memory triage. Nil-safe.
 	Metrics *StageMetrics
 
@@ -206,6 +211,7 @@ func (p *BlobPipeline) processBatch(
 		req := gitlib.TreeDiffRequest{
 			PreviousCommitHash: prevHash,
 			CommitHash:         commit.Hash(),
+			Pathspec:           p.TreeDiffPathspec,
 			Response:           respChan,
 		}
 
diff --git a/internal/framework/coordinator.go b/internal/framework/coordinator.go
index 1469445..47e9eca 100644
--- a/internal/framework/coordinator.go
+++ b/internal/framework/coordinator.go
@@ -160,6 +160,11 @@ type CoordinatorConfig struct {
 	// MaxChangesPerCommit caps the number of file changes per commit for blob loading.
 	MaxChangesPerCommit int
 
+	// TreeDiffPathspec is the libgit2 pathspec pre-filter for tree diffs,
+	// derived from --languages by the TreeDiffAnalyzer. An empty slice
+	// disables path-based filtering (libgit2 returns all deltas).
+	TreeDiffPathspec []string
+
 	// MaxDiffBatchSize is the maximum number of diff requests per batch.
 	MaxDiffBatchSize int
 
@@ -392,6 +397,8 @@ func newBlobPipelineFromConfig(
 		p.MaxChanges = config.MaxChangesPerCommit
 	}
 
+	p.TreeDiffPathspec = config.TreeDiffPathspec
+
 	return p
 }
 
diff --git a/internal/framework/runner.go b/internal/framework/runner.go
index 0a8f3da..d66605d 100644
--- a/internal/framework/runner.go
+++ b/internal/framework/runner.go
@@ -5,6 +5,9 @@ import (
 	"context"
 	"errors"
 	"fmt"
+	"log"
+	"os"
+	"path/filepath"
 	"runtime"
 	"runtime/debug"
 	"sync"
@@ -16,6 +19,7 @@ import (
 
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing"
+	"github.com/Sumatoshi-tech/codefang/internal/cache"
 	"github.com/Sumatoshi-tech/codefang/internal/checkpoint"
 	"github.com/Sumatoshi-tech/codefang/internal/observability"
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
@@ -39,6 +43,12 @@ type stateDiscarder interface {
 // ErrNotStoreWriter is returned when an analyzer does not implement [analyze.StoreWriter].
 var ErrNotStoreWriter = errors.New("analyzer does not implement StoreWriter")
 
+// ErrCacheStale is returned when the cached root SHA does not match the current repo root.
+var ErrCacheStale = errors.New("cache stale: root SHA mismatch")
+
+// ErrCacheInvalid is returned when the cached commit count exceeds available commits.
+var ErrCacheInvalid = errors.New("cache invalid: commit count mismatch")
+
 // nativeTrimInterval controls how often malloc_trim(0) is called within a chunk
 // to release native (C malloc) memory back to the OS. This prevents tree-sitter
 // and libgit2 malloc fragmentation from accumulating across commits.
@@ -101,6 +111,11 @@ type Runner struct {
 	// When non-nil, passed to Coordinator and updated by pipeline stages.
 	StageMetrics *StageMetrics
 
+	// CacheDir is the directory for incremental analysis cache.
+	// When set, the runner probes for cached state before processing and
+	// writes updated state after finalization.
+	CacheDir string
+
 	runtimeTuningOnce sync.Once
 	runtimeBallast    []byte
 }
@@ -137,15 +152,33 @@ type runState struct {
 	runner  *Runner
 	commits []*gitlib.Commit
 	reports map[analyze.HistoryAnalyzer]analyze.Report
+
+	// totalCommitCount is the original commit count before cache trimming.
+	// Used by cacheWritePhase to record the total in metadata.
+	totalCommitCount int
+
+	// cacheSubDir is the resolved cache subdirectory for this repo+branch.
+	// Empty when caching is disabled or cache probe failed.
+	cacheSubDir string
 }
 
 // Run executes all analyzers over the given commits: initialize, consume each commit via pipeline, then finalize.
+// When CacheDir is set, probes for cached state (skipping already-processed commits)
+// and writes updated state after finalization.
 func (runner *Runner) Run(ctx context.Context, commits []*gitlib.Commit) (map[analyze.HistoryAnalyzer]analyze.Report, error) {
-	final, err := pipeline.RunPhases(ctx, runState{runner: runner, commits: commits},
+	initial := runState{
+		runner:           runner,
+		commits:          commits,
+		totalCommitCount: len(commits),
+	}
+
+	final, err := pipeline.RunPhases(ctx, initial,
 		pipeline.PhaseFunc[runState](initAnalyzersPhase),
 		pipeline.PhaseFunc[runState](initAggregatorsPhase),
+		pipeline.PhaseFunc[runState](cacheProbePhase),
 		pipeline.PhaseFunc[runState](processCommitsPhase),
 		pipeline.PhaseFunc[runState](finalizePhase),
+		pipeline.PhaseFunc[runState](cacheWritePhase),
 	)
 	if err != nil {
 		return nil, err
@@ -176,7 +209,9 @@ func processCommitsPhase(ctx context.Context, s runState) (runState, error) {
 		return s, nil
 	}
 
-	_, err := s.runner.processCommits(ctx, s.commits, 0, 0)
+	indexOffset := s.totalCommitCount - len(s.commits)
+
+	_, err := s.runner.processCommits(ctx, s.commits, indexOffset, 0)
 	if err != nil {
 		return s, err
 	}
@@ -184,6 +219,42 @@ func processCommitsPhase(ctx context.Context, s runState) (runState, error) {
 	return s, nil
 }
 
+// cacheProbePhase loads cached analyzer/aggregator state and trims already-processed commits.
+// No-op when CacheDir is empty.
+func cacheProbePhase(_ context.Context, s runState) (runState, error) {
+	if s.runner.CacheDir == "" {
+		return s, nil
+	}
+
+	probed, err := s.runner.probeCache(s.commits)
+	if err != nil {
+		// Cache probe failures are non-fatal: log and proceed with full run.
+		log.Printf("cache probe failed, running full analysis: %v", err)
+
+		return s, nil
+	}
+
+	s.commits = probed.remainingCommits
+	s.cacheSubDir = probed.subDir
+
+	return s, nil
+}
+
+// cacheWritePhase saves analyzer/aggregator state for future incremental runs.
+// No-op when CacheDir is empty or cacheSubDir was not set.
+func cacheWritePhase(_ context.Context, s runState) (runState, error) {
+	if s.runner.CacheDir == "" {
+		return s, nil
+	}
+
+	writeErr := s.runner.writeCache(s.cacheSubDir, s.totalCommitCount)
+	if writeErr != nil {
+		log.Printf("cache write failed: %v", writeErr)
+	}
+
+	return s, nil
+}
+
 func finalizePhase(ctx context.Context, s runState) (runState, error) {
 	reports, err := s.runner.FinalizeWithAggregators(ctx)
 	if err != nil {
@@ -323,6 +394,156 @@ func (runner *Runner) SpillAggregators() error {
 	return nil
 }
 
+// cacheProbeResult holds the result of a successful cache probe.
+type cacheProbeResult struct {
+	remainingCommits []*gitlib.Commit
+	subDir           string
+}
+
+// cacheDirPerm is the permission for cache subdirectories.
+const cacheDirPerm = 0o750
+
+// probeCache attempts to load cached state and returns the remaining unprocessed commits.
+// Returns an error if the cache is stale or cannot be loaded.
+func (runner *Runner) probeCache(commits []*gitlib.Commit) (cacheProbeResult, error) {
+	if len(commits) == 0 {
+		return cacheProbeResult{remainingCommits: commits}, nil
+	}
+
+	rootSHA := commits[0].Hash().String()
+	branch := "" // Branch detection not yet available in Runner context.
+	subDir := filepath.Join(runner.CacheDir, cache.Key(rootSHA, branch))
+
+	meta, err := cache.ReadMeta(subDir)
+	if err != nil {
+		return cacheProbeResult{}, err
+	}
+
+	if cache.IsStale(meta, rootSHA) {
+		return cacheProbeResult{}, fmt.Errorf("%w: cached=%s, current=%s", ErrCacheStale, meta.RootSHA, rootSHA)
+	}
+
+	if meta.CommitCount > len(commits) {
+		return cacheProbeResult{}, fmt.Errorf("%w: cached %d, available %d", ErrCacheInvalid, meta.CommitCount, len(commits))
+	}
+
+	// Load checkpoint state from cached analyzers.
+	loadErr := runner.loadCachedCheckpoints(subDir)
+	if loadErr != nil {
+		return cacheProbeResult{}, fmt.Errorf("load cached checkpoints: %w", loadErr)
+	}
+
+	// Restore aggregator spill state.
+	runner.restoreCachedAggSpills(subDir)
+
+	remaining := commits[meta.CommitCount:]
+	log.Printf("cache hit: replaying %d of %d commits", len(remaining), len(commits))
+
+	return cacheProbeResult{
+		remainingCommits: remaining,
+		subDir:           subDir,
+	}, nil
+}
+
+// loadCachedCheckpoints loads checkpoint state for all Checkpointable analyzers.
+func (runner *Runner) loadCachedCheckpoints(subDir string) error {
+	for idx, analyzer := range runner.Analyzers {
+		cp, ok := analyzer.(checkpoint.Checkpointable)
+		if !ok {
+			continue
+		}
+
+		analyzerDir := filepath.Join(subDir, fmt.Sprintf("analyzer_%d", idx))
+
+		loadErr := cp.LoadCheckpoint(analyzerDir)
+		if loadErr != nil {
+			return fmt.Errorf("load checkpoint for analyzer %d: %w", idx, loadErr)
+		}
+	}
+
+	return nil
+}
+
+// restoreCachedAggSpills restores aggregator spill state from cached directories.
+func (runner *Runner) restoreCachedAggSpills(subDir string) {
+	for idx, agg := range runner.aggregators {
+		if agg == nil {
+			continue
+		}
+
+		aggDir := filepath.Join(subDir, fmt.Sprintf("agg_spill_%d", idx))
+
+		info, statErr := os.Stat(aggDir)
+		if statErr != nil || !info.IsDir() {
+			continue
+		}
+
+		agg.RestoreSpillState(analyze.AggregatorSpillInfo{Dir: aggDir})
+	}
+}
+
+// writeCache saves analyzer/aggregator state for future incremental runs.
+func (runner *Runner) writeCache(subDir string, totalCommits int) error {
+	if subDir == "" {
+		// No cache probe succeeded, so derive the cache subdirectory here.
+		if totalCommits == 0 {
+			return nil
+		}
+
+		rootSHA := ""
+		branch := ""
+
+		// FIXME: rootSHA is empty here, so this key will not match the one probeCache derives from commits[0].
+		subDir = filepath.Join(runner.CacheDir, cache.Key(rootSHA, branch))
+	}
+
+	mkErr := os.MkdirAll(subDir, cacheDirPerm)
+	if mkErr != nil {
+		return fmt.Errorf("create cache dir: %w", mkErr)
+	}
+
+	// Save checkpoint state for all Checkpointable analyzers.
+	for idx, analyzer := range runner.Analyzers {
+		cp, ok := analyzer.(checkpoint.Checkpointable)
+		if !ok {
+			continue
+		}
+
+		analyzerDir := filepath.Join(subDir, fmt.Sprintf("analyzer_%d", idx))
+
+		saveErr := os.MkdirAll(analyzerDir, cacheDirPerm)
+		if saveErr != nil {
+			return fmt.Errorf("create analyzer cache dir: %w", saveErr)
+		}
+
+		cpErr := cp.SaveCheckpoint(analyzerDir)
+		if cpErr != nil {
+			return fmt.Errorf("save checkpoint for analyzer %d: %w", idx, cpErr)
+		}
+	}
+
+	// Spill aggregator state.
+	spillErr := runner.SpillAggregators()
+	if spillErr != nil {
+		return fmt.Errorf("spill aggregators for cache: %w", spillErr)
+	}
+
+	// Write cache metadata.
+	analyzerIDs := make([]string, 0, len(runner.Analyzers))
+	for _, a := range runner.Analyzers {
+		analyzerIDs = append(analyzerIDs, a.Name())
+	}
+
+	meta := cache.IncrementalMeta{
+		Version:     1,
+		CommitCount: totalCommits,
+		AnalyzerIDs: analyzerIDs,
+		Timestamp:   time.Now().UTC(),
+	}
+
+	return cache.WriteMeta(subDir, meta)
+}
+
 // DiscardAggregatorState clears all in-memory cumulative state from
 // aggregators without serialization. Used in streaming timeseries NDJSON
 // mode where per-commit data is drained each chunk and cumulative state
diff --git a/internal/framework/runner_internal_test.go b/internal/framework/runner_internal_test.go
index fc313dc..87ca031 100644
--- a/internal/framework/runner_internal_test.go
+++ b/internal/framework/runner_internal_test.go
@@ -31,9 +31,9 @@ func TestRunner_drainWorkerTCs_ConcurrentRouting(t *testing.T) {
 		commitMeta: make(map[string]analyze.CommitMeta),
 	}
 
-	var active int32
+	var active atomic.Int32
 
-	var maxActive int32
+	var maxActive atomic.Int32
 
 	var startWg sync.WaitGroup
 
@@ -43,21 +43,21 @@ func TestRunner_drainWorkerTCs_ConcurrentRouting(t *testing.T) {
 		startWg.Done()
 		startWg.Wait()
 
-		current := atomic.AddInt32(&active, 1)
+		current := active.Add(1)
 
 		for {
-			maxA := atomic.LoadInt32(&maxActive)
+			maxA := maxActive.Load()
 			if current <= maxA {
 				break
 			}
 
-			if atomic.CompareAndSwapInt32(&maxActive, maxA, current) {
+			if maxActive.CompareAndSwap(maxA, current) {
 				break
 			}
 		}
 
 		time.Sleep(10 * time.Millisecond)
-		atomic.AddInt32(&active, -1)
+		active.Add(-1)
 
 		return nil
 	}
@@ -78,5 +78,5 @@ func TestRunner_drainWorkerTCs_ConcurrentRouting(t *testing.T) {
 	elapsed := time.Since(start)
 
 	assert.Less(t, elapsed, 50*time.Millisecond, "should run concurrently")
-	assert.Equal(t, int32(2), atomic.LoadInt32(&maxActive), "should have 2 concurrent routes")
+	assert.Equal(t, int32(2), maxActive.Load(), "should have 2 concurrent routes")
 }
diff --git a/internal/framework/runner_test.go b/internal/framework/runner_test.go
index 3336dbe..e87ae8a 100644
--- a/internal/framework/runner_test.go
+++ b/internal/framework/runner_test.go
@@ -1212,8 +1212,6 @@ func registerGobTypes() {
 	gob.Register([]string{})
 }
 
-// FRD: specs/frds/FRD-20260228-runner-integration.md.
-
 func TestFinalizeToStore_NoAggregators(t *testing.T) {
 	t.Parallel()
 
diff --git a/internal/framework/sampler.go b/internal/framework/sampler.go
index bb96008..4c992d6 100644
--- a/internal/framework/sampler.go
+++ b/internal/framework/sampler.go
@@ -6,6 +6,7 @@ import (
 	"log/slog"
 	"os"
 	"runtime/pprof"
+	"sync/atomic"
 	"time"
 
 	"github.com/Sumatoshi-tech/codefang/internal/observability"
@@ -21,6 +22,10 @@ const kilo = 1000
 // PipelineSampler periodically logs comprehensive memory and pipeline metrics
 // during chunk processing. Implements playbook section 2.1: "lightweight
 // periodic sampler (always-on in debug builds).".
+//
+// t1Captured is atomic because the sampler goroutine (driven by its ticker)
+// and the caller goroutine (via CaptureT1) both race to capture the t1 peak
+// heap profile; CompareAndSwap guarantees exactly one wins.
 type PipelineSampler struct {
 	logger       *slog.Logger
 	metrics      *StageMetrics
@@ -29,8 +34,7 @@ type PipelineSampler struct {
 	chunkIndex   int
 	memBudget    int64
 	profileAtRSS int64 // RSS threshold (bytes) to trigger t1 heap profile.
-	t0Captured   bool
-	t1Captured   bool
+	t1Captured   atomic.Bool
 }
 
 // SamplerConfig configures the pipeline sampler.
@@ -68,7 +72,6 @@ func (s *PipelineSampler) Start(ctx context.Context) {
 	// Capture t0 heap profile (playbook step 2: "take snapshot at t0").
 	if s.dumpDir != "" {
 		s.captureProfile("t0")
-		s.t0Captured = true
 	}
 
 	go s.run(ctx)
@@ -141,19 +144,27 @@ func (s *PipelineSampler) sample(tick int) {
 	)
 
 	// Auto-capture t1 profile on RSS threshold (playbook step 2: "at or right after peak").
-	if s.profileAtRSS > 0 && !s.t1Captured && snap.RSS >= s.profileAtRSS {
+	// CompareAndSwap guarantees at most one capture across both the sampler
+	// goroutine and any concurrent CaptureT1 caller.
+	if s.profileAtRSS > 0 && snap.RSS >= s.profileAtRSS && s.t1Captured.CompareAndSwap(false, true) {
 		s.captureProfile("t1")
-		s.t1Captured = true
 	}
 }
 
 // CaptureT1 forces capture of the t1 (peak) heap profile. Call after the
 // chunk completes if the automatic RSS threshold wasn't hit.
+// Safe to call concurrently with the sampler goroutine — at most one capture
+// wins via CompareAndSwap.
 func (s *PipelineSampler) CaptureT1() {
-	if s.dumpDir != "" && !s.t1Captured {
-		s.captureProfile("t1")
-		s.t1Captured = true
+	if s.dumpDir == "" {
+		return
+	}
+
+	if !s.t1Captured.CompareAndSwap(false, true) {
+		return
 	}
+
+	s.captureProfile("t1")
 }
 
 func (s *PipelineSampler) captureProfile(label string) {
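Review note: the t1Captured change relies on `atomic.Bool.CompareAndSwap(false, true)` succeeding for exactly one caller, however many goroutines attempt it. Below is a minimal sketch of that one-shot guard in isolation. The `oneShot` type and `raceCaptures` helper are hypothetical and only stand in for the sampler and its captureProfile call.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// oneShot is a hypothetical reduction of the t1Captured guard.
type oneShot struct {
	done     atomic.Bool
	captures atomic.Int32
}

// capture performs the guarded action only for the caller that flips
// done from false to true; every other caller returns false.
func (o *oneShot) capture() bool {
	if !o.done.CompareAndSwap(false, true) {
		return false
	}

	o.captures.Add(1) // Stands in for captureProfile("t1").

	return true
}

// raceCaptures calls capture from n goroutines and reports how many
// captures actually happened.
func raceCaptures(n int) int32 {
	var (
		guard oneShot
		wg    sync.WaitGroup
	)

	for i := 0; i < n; i++ {
		wg.Add(1)

		go func() {
			defer wg.Done()
			guard.capture()
		}()
	}

	wg.Wait()

	return guard.captures.Load()
}

func main() {
	fmt.Println(raceCaptures(8)) // prints: 1
}
```

Putting the CompareAndSwap last in the condition, as sample does, keeps the flag unset until the RSS threshold is actually reached, so a later CaptureT1 call can still win.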
diff --git a/internal/identity/split.go b/internal/identity/split.go
new file mode 100644
index 0000000..8305687
--- /dev/null
+++ b/internal/identity/split.go
@@ -0,0 +1,46 @@
+package identity
+
+import "strings"
+
+// SplitIdentity splits a pipe-delimited or exact-format identity string
+// into a canonical name and email.
+//
+// Exact format: "name <email>" → name and email (checked first).
+// Pipe-delimited format: "name1|name2|email1|email2" → first non-email part, first email part.
+// Plain name: "name" → name and empty email.
+func SplitIdentity(s string) (name, email string) {
+	if s == "" {
+		return "", ""
+	}
+
+	// Exact format: "name <email>".
+	if idx := strings.Index(s, " <"); idx > 0 && strings.HasSuffix(s, ">") {
+		return strings.TrimSpace(s[:idx]), s[idx+2 : len(s)-1]
+	}
+
+	// Pipe-delimited format.
+	if strings.Contains(s, "|") {
+		return splitPipeIdentity(s)
+	}
+
+	// Plain name, no email.
+	return s, ""
+}
+
+func splitPipeIdentity(s string) (name, email string) {
+	for part := range strings.SplitSeq(s, "|") {
+		if name == "" && !strings.Contains(part, "@") {
+			name = part
+		}
+
+		if email == "" && strings.Contains(part, "@") {
+			email = part
+		}
+
+		if name != "" && email != "" {
+			break
+		}
+	}
+
+	return name, email
+}
diff --git a/internal/identity/split_test.go b/internal/identity/split_test.go
new file mode 100644
index 0000000..b0487cf
--- /dev/null
+++ b/internal/identity/split_test.go
@@ -0,0 +1,68 @@
+package identity_test
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+
+	"github.com/Sumatoshi-tech/codefang/internal/identity"
+)
+
+const (
+	testName  = "daniel smith"
+	testEmail = "dbsmith@google.com"
+)
+
+func TestSplitIdentity_PipeDelimited(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity("daniel smith|dbsmith@google.com")
+
+	assert.Equal(t, testName, name)
+	assert.Equal(t, testEmail, email)
+}
+
+func TestSplitIdentity_ExactFormat(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity("daniel smith <dbsmith@google.com>")
+
+	assert.Equal(t, testName, name)
+	assert.Equal(t, testEmail, email)
+}
+
+func TestSplitIdentity_NameOnly(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity("daniel smith")
+
+	assert.Equal(t, testName, name)
+	assert.Empty(t, email)
+}
+
+func TestSplitIdentity_Empty(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity("")
+
+	assert.Empty(t, name)
+	assert.Empty(t, email)
+}
+
+func TestSplitIdentity_MultipleAliases(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity("alice|bob|alice@example.com|bob@example.com")
+
+	assert.Equal(t, "alice", name)
+	assert.Equal(t, "alice@example.com", email)
+}
+
+func TestSplitIdentity_UnmatchedAuthor(t *testing.T) {
+	t.Parallel()
+
+	name, email := identity.SplitIdentity(identity.AuthorMissingName)
+
+	assert.Equal(t, identity.AuthorMissingName, name)
+	assert.Empty(t, email)
+}
diff --git a/internal/importmodel/file.go b/internal/importmodel/file.go
deleted file mode 100644
index 5c9d446..0000000
--- a/internal/importmodel/file.go
+++ /dev/null
@@ -1,9 +0,0 @@
-// Package importmodel defines the data model for source file import analysis.
-package importmodel
-
-// File represents a source file with its detected imports, language, and any parse error.
-type File struct {
-	Imports []string
-	Lang    string
-	Error   error
-}
diff --git a/internal/observability/health_test.go b/internal/observability/health_test.go
index f7dcfa1..39b03db 100644
--- a/internal/observability/health_test.go
+++ b/internal/observability/health_test.go
@@ -19,7 +19,7 @@ func TestHealthHandler_ReturnsOK(t *testing.T) {
 
 	handler := observability.HealthHandler()
 
-	req := httptest.NewRequest(http.MethodGet, "/healthz", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/healthz", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
@@ -38,7 +38,7 @@ func TestHealthHandler_ContentTypeJSON(t *testing.T) {
 
 	handler := observability.HealthHandler()
 
-	req := httptest.NewRequest(http.MethodGet, "/healthz", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/healthz", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
@@ -53,7 +53,7 @@ func TestReadyHandler_AllChecksPass(t *testing.T) {
 	passCheckB := func(_ context.Context) error { return nil }
 	handler := observability.ReadyHandler(passCheckA, passCheckB)
 
-	req := httptest.NewRequest(http.MethodGet, "/readyz", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/readyz", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
@@ -72,7 +72,7 @@ func TestReadyHandler_NoChecks(t *testing.T) {
 
 	handler := observability.ReadyHandler()
 
-	req := httptest.NewRequest(http.MethodGet, "/readyz", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/readyz", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
@@ -91,7 +91,7 @@ func TestReadyHandler_CheckFails(t *testing.T) {
 
 	handler := observability.ReadyHandler(passCheck, failCheck)
 
-	req := httptest.NewRequest(http.MethodGet, "/readyz", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/readyz", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
diff --git a/internal/observability/integration_test.go b/internal/observability/integration_test.go
index 6f3d137..4e4e00e 100644
--- a/internal/observability/integration_test.go
+++ b/internal/observability/integration_test.go
@@ -132,7 +132,7 @@ func TestEndToEnd_MiddlewareProducesSpans(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, inner)
 
-	req := httptest.NewRequest(http.MethodPost, "/v1/analyze", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodPost, "/v1/analyze", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	mw.ServeHTTP(rec, req)
diff --git a/internal/observability/metric_builder_test.go b/internal/observability/metric_builder_test.go
index b2f36f7..498daa8 100644
--- a/internal/observability/metric_builder_test.go
+++ b/internal/observability/metric_builder_test.go
@@ -1,7 +1,5 @@
 package observability
 
-// FRD: specs/frds/FRD-20260302-observability-dedup.md.
-
 import (
 	"errors"
 	"testing"
diff --git a/internal/observability/middleware_test.go b/internal/observability/middleware_test.go
index 3d19df3..5de1e8c 100644
--- a/internal/observability/middleware_test.go
+++ b/internal/observability/middleware_test.go
@@ -44,7 +44,7 @@ func TestHTTPMiddleware_CreatesSpan(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, handler)
 
-	req := httptest.NewRequest(http.MethodGet, "/v1/analyze", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/v1/analyze", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	mw.ServeHTTP(rec, req)
@@ -74,7 +74,7 @@ func TestHTTPMiddleware_PropagatesContext(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, handler)
 
-	req := httptest.NewRequest(http.MethodPost, "/v1/history", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodPost, "/v1/history", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	mw.ServeHTTP(rec, req)
@@ -114,7 +114,7 @@ func TestHTTPMiddleware_ExtractsTraceParent(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, handler)
 
-	req := httptest.NewRequest(http.MethodGet, "/v1/analyze", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/v1/analyze", http.NoBody)
 	req.Header.Set("Traceparent", traceparent)
 
 	rec := httptest.NewRecorder()
@@ -145,7 +145,7 @@ func TestHTTPMiddleware_RecoversPanic(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, handler)
 
-	req := httptest.NewRequest(http.MethodGet, "/v1/crash", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/v1/crash", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	// Should not panic — middleware should recover.
@@ -199,7 +199,7 @@ func TestHTTPMiddleware_SetsStatusOnError(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, discardLogger, handler)
 
-	req := httptest.NewRequest(http.MethodGet, "/v1/score", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/v1/score", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	mw.ServeHTTP(rec, req)
@@ -286,7 +286,7 @@ func TestHTTPMiddleware_AccessLog(t *testing.T) {
 
 	mw := observability.HTTPMiddleware(tracer, logger, handler)
 
-	req := httptest.NewRequest(http.MethodGet, "/v1/analyze", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/v1/analyze", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	mw.ServeHTTP(rec, req)
diff --git a/internal/observability/prometheus_test.go b/internal/observability/prometheus_test.go
index 124d452..8c8fb9e 100644
--- a/internal/observability/prometheus_test.go
+++ b/internal/observability/prometheus_test.go
@@ -1,6 +1,7 @@
 package observability_test
 
 import (
+	"context"
 	"net/http"
 	"net/http/httptest"
 	"testing"
@@ -17,7 +18,7 @@ func TestPrometheusHandler_ServesMetrics(t *testing.T) {
 	handler, err := observability.PrometheusHandler()
 	require.NoError(t, err)
 
-	req := httptest.NewRequest(http.MethodGet, "/metrics", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/metrics", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
@@ -33,7 +34,7 @@ func TestPrometheusHandler_ContainsTargetInfo(t *testing.T) {
 	handler, err := observability.PrometheusHandler()
 	require.NoError(t, err)
 
-	req := httptest.NewRequest(http.MethodGet, "/metrics", http.NoBody)
+	req := httptest.NewRequestWithContext(context.Background(), http.MethodGet, "/metrics", http.NoBody)
 	rec := httptest.NewRecorder()
 
 	handler.ServeHTTP(rec, req)
diff --git a/internal/observability/sysmetrics_test.go b/internal/observability/sysmetrics_test.go
index 25d18e5..23be04e 100644
--- a/internal/observability/sysmetrics_test.go
+++ b/internal/observability/sysmetrics_test.go
@@ -1,7 +1,5 @@
 package observability_test
 
-// FRD: specs/frds/FRD-20260302-sysmetrics-move.md.
-
 import (
 	"runtime"
 	"testing"
diff --git a/internal/plumbing/fact_accessors_test.go b/internal/plumbing/fact_accessors_test.go
index b7b61a4..77e0702 100644
--- a/internal/plumbing/fact_accessors_test.go
+++ b/internal/plumbing/fact_accessors_test.go
@@ -1,4 +1,3 @@
-// FRD: specs/frds/FRD-20260302-typed-fact-accessors.md.
 package plumbing_test
 
 import (
diff --git a/internal/storage/atomicfile_test.go b/internal/storage/atomicfile_test.go
index eec2d56..f6ace6c 100644
--- a/internal/storage/atomicfile_test.go
+++ b/internal/storage/atomicfile_test.go
@@ -1,7 +1,5 @@
 package storage
 
-// FRD: specs/frds/FRD-20260310-atomic-file-write.md.
-
 import (
 	"errors"
 	"fmt"
diff --git a/pkg/alg/chunk_test.go b/pkg/alg/chunk_test.go
index 12db998..11d27c3 100644
--- a/pkg/alg/chunk_test.go
+++ b/pkg/alg/chunk_test.go
@@ -1,7 +1,5 @@
 package alg_test
 
-// FRD: specs/frds/FRD-20260302-chunk-pairs.md.
-
 import (
 	"testing"
 
diff --git a/pkg/alg/interval/interval_test.go b/pkg/alg/interval/interval_test.go
index e65b6a0..851cba4 100644
--- a/pkg/alg/interval/interval_test.go
+++ b/pkg/alg/interval/interval_test.go
@@ -1,7 +1,5 @@
 package interval
 
-// FRD: specs/frds/FRD-20260302-generic-interval-tree.md.
-
 import (
 	"testing"
 
diff --git a/pkg/alg/iter_test.go b/pkg/alg/iter_test.go
index f82a767..8959bc6 100644
--- a/pkg/alg/iter_test.go
+++ b/pkg/alg/iter_test.go
@@ -1,7 +1,5 @@
 package alg
 
-// FRD: specs/frds/FRD-20260310-iterator.md.
-
 import (
 	"errors"
 	"io"
diff --git a/pkg/alg/lru/benchmark_test.go b/pkg/alg/lru/benchmark_test.go
index 8928948..74d973b 100644
--- a/pkg/alg/lru/benchmark_test.go
+++ b/pkg/alg/lru/benchmark_test.go
@@ -1,7 +1,5 @@
 package lru_test
 
-// FRD: specs/frds/FRD-20260302-generic-lru-cache.md.
-
 import (
 	"testing"
 
diff --git a/pkg/alg/lru/cache_test.go b/pkg/alg/lru/cache_test.go
index 4aa4d3f..e2bf774 100644
--- a/pkg/alg/lru/cache_test.go
+++ b/pkg/alg/lru/cache_test.go
@@ -1,4 +1,3 @@
-// FRD: specs/frds/FRD-20260302-generic-lru-cache.md.
 package lru_test
 
 import (
diff --git a/pkg/alg/mapx/maps_test.go b/pkg/alg/mapx/maps_test.go
index 47f1969..27ddb5d 100644
--- a/pkg/alg/mapx/maps_test.go
+++ b/pkg/alg/mapx/maps_test.go
@@ -148,7 +148,6 @@ func TestMergeAdditive(t *testing.T) {
 	})
 }
 
-// FRD: specs/frds/FRD-20260306-merge-nested-additive.md.
 func TestMergeNestedAdditive(t *testing.T) {
 	t.Parallel()
 
@@ -211,8 +210,6 @@ func TestMergeNestedAdditive(t *testing.T) {
 	})
 }
 
-// FRD: specs/frds/FRD-20260310-estimate-map-size.md.
-
 func TestEstimateMapSize(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/alg/mapx/slices_test.go b/pkg/alg/mapx/slices_test.go
index 5e61d27..10bf892 100644
--- a/pkg/alg/mapx/slices_test.go
+++ b/pkg/alg/mapx/slices_test.go
@@ -6,8 +6,6 @@ import (
 	"github.com/stretchr/testify/assert"
 )
 
-// FRD: specs/frds/FRD-20260303-sort-and-limit.md.
-
 func TestSortAndLimit(t *testing.T) {
 	t.Parallel()
 
@@ -67,8 +65,6 @@ func TestSortAndLimit(t *testing.T) {
 	})
 }
 
-// FRD: specs/frds/FRD-20260303-build-lookup-set.md.
-
 func TestBuildLookupSet(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/alg/pairs_test.go b/pkg/alg/pairs_test.go
index f4f7673..361a229 100644
--- a/pkg/alg/pairs_test.go
+++ b/pkg/alg/pairs_test.go
@@ -1,7 +1,5 @@
 package alg_test
 
-// FRD: specs/frds/FRD-20260302-chunk-pairs.md.
-
 import (
 	"testing"
 
diff --git a/pkg/alg/stats/stats_test.go b/pkg/alg/stats/stats_test.go
index 8893cac..0d7ef09 100644
--- a/pkg/alg/stats/stats_test.go
+++ b/pkg/alg/stats/stats_test.go
@@ -193,8 +193,6 @@ func TestMeanStdDev(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260303-to-percent.md.
-
 func TestToPercent(t *testing.T) {
 	t.Parallel()
 
@@ -251,8 +249,6 @@ func TestMean(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260310-exceeds-threshold.md.
-
 func TestExceedsThreshold(t *testing.T) {
 	t.Parallel()
 
@@ -286,8 +282,6 @@ func TestExceedsThreshold(t *testing.T) {
 	}
 }
 
-// FRD: specs/frds/FRD-20260303-distribution.md.
-
 func TestDistribution(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/alg/tree_test.go b/pkg/alg/tree_test.go
index 116f4d9..394ce57 100644
--- a/pkg/alg/tree_test.go
+++ b/pkg/alg/tree_test.go
@@ -1,5 +1,3 @@
-// FRD: specs/frds/FRD-20260310-traverse-tree.md.
-
 package alg
 
 import (
diff --git a/pkg/gitlib/cgo_bridge.go b/pkg/gitlib/cgo_bridge.go
index dde36d5..5236763 100644
--- a/pkg/gitlib/cgo_bridge.go
+++ b/pkg/gitlib/cgo_bridge.go
@@ -61,6 +61,29 @@ func NewCGOBridge(repo *Repository) *CGOBridge {
 	return &CGOBridge{repo: repo}
 }
 
+// marshalPathspec converts a Go []string into a C **char array suitable for
+// passing to cf_tree_diff_v2. Returns a free function that must be deferred
+// by the caller to release the C memory. A nil/empty pathspec returns
+// (nil, noop-free).
+func marshalPathspec(pathspec []string) (**C.char, func()) {
+	if len(pathspec) == 0 {
+		return nil, func() {}
+	}
+
+	cStrings := make([]*C.char, len(pathspec))
+	for i, s := range pathspec {
+		cStrings[i] = C.CString(s)
+	}
+
+	free := func() {
+		for _, cs := range cStrings {
+			C.free(unsafe.Pointer(cs))
+		}
+	}
+
+	return (**C.char)(unsafe.Pointer(&cStrings[0])), free
+}
+
 // getRepoPtr extracts the underlying C pointer from git2go.Repository.
 // Uses reflection to access the unexported 'ptr' field.
 func (b *CGOBridge) getRepoPtr() unsafe.Pointer {
@@ -289,9 +312,14 @@ func (b *CGOBridge) BatchLoadBlobs(hashes []Hash) []BlobResult {
 	return results
 }
 
-// TreeDiff computes the difference between two trees in a single batch CGO call.
-// Skips libgit2 diff when both tree OIDs are equal (e.g. metadata-only commits).
-func (b *CGOBridge) TreeDiff(oldTreeHash, newTreeHash Hash) (Changes, error) {
+// TreeDiffWithPathspec computes the difference between two trees in a single
+// batch CGO call. pathspec is a list of fnmatch-style globs (e.g. "*.go",
+// "Dockerfile") applied as a libgit2 pre-filter; when empty or nil, libgit2
+// returns the full diff. Skips libgit2 diff when both tree OIDs are equal
+// (e.g. metadata-only commits).
+func (b *CGOBridge) TreeDiffWithPathspec(
+	oldTreeHash, newTreeHash Hash, pathspec []string,
+) (Changes, error) {
 	if !oldTreeHash.IsZero() && !newTreeHash.IsZero() && oldTreeHash == newTreeHash {
 		return make(Changes, 0), nil
 	}
@@ -318,6 +346,9 @@ func (b *CGOBridge) TreeDiff(oldTreeHash, newTreeHash Hash) (Changes, error) {
 		return nil, ErrRepositoryPointer
 	}
 
+	cPathspec, freePathspec := marshalPathspec(pathspec)
+	defer freePathspec()
+
 	var cResult C.cf_tree_diff_result
 
 	// Ensure result is clean
@@ -325,10 +356,12 @@ func (b *CGOBridge) TreeDiff(oldTreeHash, newTreeHash Hash) (Changes, error) {
 	cResult.count = 0
 
 	// Call C function
-	ret := C.cf_tree_diff(
+	ret := C.cf_tree_diff_v2(
 		(*C.git_repository)(repoPtr),
 		pOldOid,
 		pNewOid,
+		cPathspec,
+		C.size_t(len(pathspec)),
 		&cResult,
 	)
 
diff --git a/pkg/gitlib/clib/codefang_git.h b/pkg/gitlib/clib/codefang_git.h
index c5e28b6..a68d2da 100644
--- a/pkg/gitlib/clib/codefang_git.h
+++ b/pkg/gitlib/clib/codefang_git.h
@@ -140,6 +140,21 @@ int cf_tree_diff(
     cf_tree_diff_result* result
 );
 
+/*
+ * Compute diff between two trees with an optional pathspec pre-filter.
+ * pathspec points to an array of pathspec_n C strings; when pathspec_n
+ * is 0 or pathspec is NULL the call is equivalent to cf_tree_diff.
+ * Each pathspec entry is an fnmatch-style glob (e.g. "*.go", "Dockerfile").
+ */
+int cf_tree_diff_v2(
+    git_repository* repo,
+    git_oid* old_tree_oid,
+    git_oid* new_tree_oid,
+    const char** pathspec,
+    size_t pathspec_n,
+    cf_tree_diff_result* result
+);
+
 /*
  * Free tree diff result.
  */
diff --git a/pkg/gitlib/clib/diff_ops.c b/pkg/gitlib/clib/diff_ops.c
index 70e2793..c1d8321 100644
--- a/pkg/gitlib/clib/diff_ops.c
+++ b/pkg/gitlib/clib/diff_ops.c
@@ -622,6 +622,21 @@ int cf_tree_diff(
     git_oid* old_tree_oid,
     git_oid* new_tree_oid,
     cf_tree_diff_result* result
+) {
+    return cf_tree_diff_v2(repo, old_tree_oid, new_tree_oid, NULL, 0, result);
+}
+
+/*
+ * Compute diff between two trees with an optional libgit2 pathspec
+ * pre-filter. See header for semantics.
+ */
+int cf_tree_diff_v2(
+    git_repository* repo,
+    git_oid* old_tree_oid,
+    git_oid* new_tree_oid,
+    const char** pathspec,
+    size_t pathspec_n,
+    cf_tree_diff_result* result
 ) {
     git_tree* old_tree = NULL;
     git_tree* new_tree = NULL;
@@ -650,6 +665,13 @@ int cf_tree_diff(
 
     /* Compute diff */
     git_diff_options opts = GIT_DIFF_OPTIONS_INIT;
+    if (pathspec != NULL && pathspec_n > 0) {
+        /* git_strarray.strings is declared `char**` but libgit2 only
+         * reads from it during the diff call, so the const-stripping
+         * cast is safe. The Go bridge owns all backing memory. */
+        opts.pathspec.strings = (char**)pathspec;
+        opts.pathspec.count   = pathspec_n;
+    }
     if (git_diff_tree_to_tree(&diff, repo, old_tree, new_tree, &opts) != 0) {
         ret = CF_ERR_DIFF;
         goto cleanup;
diff --git a/pkg/gitlib/worker.go b/pkg/gitlib/worker.go
index 224fc91..c23950b 100644
--- a/pkg/gitlib/worker.go
+++ b/pkg/gitlib/worker.go
@@ -33,7 +33,11 @@ type TreeDiffRequest struct {
 	PreviousTree       *Tree // Optimization: Use existing tree if on same worker/repo.
 	PreviousCommitHash Hash  // Fallback: Lookup previous tree by hash (safe for pool workers).
 	CommitHash         Hash  // Hash of the commit to process.
-	Response           chan<- TreeDiffResponse
+	// Pathspec restricts the diff to files matching any of the given
+	// fnmatch-style globs (e.g. []string{"*.go", "Dockerfile"}). An empty
+	// or nil slice disables path-based pre-filtering.
+	Pathspec []string
+	Response chan<- TreeDiffResponse
 }
 
 // TreeDiffResponse is the response for a TreeDiffRequest.
@@ -185,7 +189,7 @@ func (w *Worker) handle(req WorkerRequest) {
 			prevTreeHash := prevCommit.TreeHash()
 			prevCommit.Free()
 
-			changes, err = w.bridge.TreeDiff(prevTreeHash, currTreeHash)
+			changes, err = w.bridge.TreeDiffWithPathspec(prevTreeHash, currTreeHash, typedReq.Pathspec)
 		default:
 			changes, err = InitialTreeChanges(ctx, w.repo, commitTree)
 		}
diff --git a/pkg/gitlib/worker_test.go b/pkg/gitlib/worker_test.go
index 8014960..99e5dd7 100644
--- a/pkg/gitlib/worker_test.go
+++ b/pkg/gitlib/worker_test.go
@@ -4,6 +4,7 @@ import (
 	"context"
 	"testing"
 
+	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 
 	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
@@ -299,6 +300,54 @@ func TestCGOBridge_BatchDiffBlobsInvalidHash(t *testing.T) {
 	require.Equal(t, gitlib.ErrDiffLookup, results[0].Error)
 }
 
+// TestCGOBridge_TreeDiffWithPathspec_FiltersByGlob verifies that passing
+// a pathspec to the cgo bridge drops non-matching files at the libgit2
+// level — before they cross the cgo boundary.
+func TestCGOBridge_TreeDiffWithPathspec_FiltersByGlob(t *testing.T) {
+	t.Parallel()
+
+	tr := newTestRepo(t)
+	defer tr.cleanup()
+
+	tr.createFile("a.go", "package a")
+	tr.createFile("b.py", "x = 1")
+	tr.createFile("c.js", "var y = 2;")
+	firstHash := tr.commit("first")
+
+	tr.createFile("a.go", "package a\n// edit")
+	tr.createFile("b.py", "x = 2")
+	tr.createFile("c.js", "var y = 3;")
+	secondHash := tr.commit("second")
+
+	repo, err := gitlib.OpenRepository(tr.path)
+	require.NoError(t, err)
+
+	defer repo.Free()
+
+	firstCommit, err := repo.LookupCommit(context.Background(), firstHash)
+	require.NoError(t, err)
+
+	defer firstCommit.Free()
+
+	secondCommit, err := repo.LookupCommit(context.Background(), secondHash)
+	require.NoError(t, err)
+
+	defer secondCommit.Free()
+
+	bridge := gitlib.NewCGOBridge(repo)
+
+	baseline, err := bridge.TreeDiffWithPathspec(firstCommit.TreeHash(), secondCommit.TreeHash(), nil)
+	require.NoError(t, err)
+	require.Len(t, baseline, 3, "baseline must see all 3 modified files")
+
+	filtered, err := bridge.TreeDiffWithPathspec(
+		firstCommit.TreeHash(), secondCommit.TreeHash(), []string{"*.go"},
+	)
+	require.NoError(t, err)
+	require.Len(t, filtered, 1, "pathspec '*.go' must restrict to Go files")
+	assert.Equal(t, "a.go", filtered[0].To.Name)
+}
+
 // TestCGOBridge_TreeDiffSameHash verifies TreeDiff returns empty when both tree hashes are equal (skip path).
 func TestCGOBridge_TreeDiffSameHash(t *testing.T) {
 	t.Parallel()
@@ -323,7 +372,7 @@ func TestCGOBridge_TreeDiffSameHash(t *testing.T) {
 	require.False(t, treeHash.IsZero())
 
 	bridge := gitlib.NewCGOBridge(repo)
-	changes, err := bridge.TreeDiff(treeHash, treeHash)
+	changes, err := bridge.TreeDiffWithPathspec(treeHash, treeHash, nil)
 	require.NoError(t, err)
 	require.Empty(t, changes)
 }
diff --git a/pkg/iosafety/iosafety_test.go b/pkg/iosafety/iosafety_test.go
index 3cc6a7d..d584974 100644
--- a/pkg/iosafety/iosafety_test.go
+++ b/pkg/iosafety/iosafety_test.go
@@ -9,8 +9,6 @@ import (
 	"github.com/stretchr/testify/require"
 )
 
-// FRD: specs/frds/FRD-20260310-iosafety-promote.md.
-
 func TestResolvePath_EmptyPath(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/meminfo/rss_test.go b/pkg/meminfo/rss_test.go
index c290e0f..473f0c5 100644
--- a/pkg/meminfo/rss_test.go
+++ b/pkg/meminfo/rss_test.go
@@ -1,7 +1,5 @@
 package meminfo
 
-// FRD: specs/frds/FRD-20260312-static-rss-logging.md.
-
 import (
 	"runtime"
 	"testing"
diff --git a/pkg/metrics/metrics_test.go b/pkg/metrics/metrics_test.go
index 087c4df..59a98a5 100644
--- a/pkg/metrics/metrics_test.go
+++ b/pkg/metrics/metrics_test.go
@@ -186,8 +186,6 @@ func TestTimeSeriesPoint_Fields(t *testing.T) {
 	assert.InDelta(t, float64(testInputValue), point.Value, 0.001)
 }
 
-// FRD: specs/frds/FRD-20260303-risk-priority.md.
-
 func TestRiskPriority_AllLevels(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/pipeline/batcher_test.go b/pkg/pipeline/batcher_test.go
index 027c8fc..059268c 100644
--- a/pkg/pipeline/batcher_test.go
+++ b/pkg/pipeline/batcher_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260302-composable-pipeline-patterns.md.
-
 import (
 	"testing"
 
diff --git a/pkg/pipeline/dispatch_test.go b/pkg/pipeline/dispatch_test.go
index 258b569..2428229 100644
--- a/pkg/pipeline/dispatch_test.go
+++ b/pkg/pipeline/dispatch_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260302-composable-pipeline-patterns.md.
-
 import (
 	"context"
 	"errors"
diff --git a/pkg/pipeline/drain_test.go b/pkg/pipeline/drain_test.go
index 1c3a55a..1ebe05a 100644
--- a/pkg/pipeline/drain_test.go
+++ b/pkg/pipeline/drain_test.go
@@ -7,8 +7,6 @@ import (
 	"github.com/stretchr/testify/require"
 )
 
-// FRD: specs/frds/FRD-20260310-signal-on-drain.md.
-
 const forwardTestItems = 3
 
 func TestSignalOnDrain_ForwardsAllItems(t *testing.T) {
diff --git a/pkg/pipeline/fetcher_test.go b/pkg/pipeline/fetcher_test.go
index aa0ec6f..574ff43 100644
--- a/pkg/pipeline/fetcher_test.go
+++ b/pkg/pipeline/fetcher_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260302-composable-pipeline-patterns.md.
-
 import (
 	"context"
 	"errors"
diff --git a/pkg/pipeline/phase_test.go b/pkg/pipeline/phase_test.go
index e43b63f..1eb6baa 100644
--- a/pkg/pipeline/phase_test.go
+++ b/pkg/pipeline/phase_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260302-composable-pipeline-patterns.md.
-
 import (
 	"context"
 	"errors"
diff --git a/pkg/pipeline/runpc_test.go b/pkg/pipeline/runpc_test.go
index 2f95c57..0fedbc9 100644
--- a/pkg/pipeline/runpc_test.go
+++ b/pkg/pipeline/runpc_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260302-composable-pipeline-patterns.md.
-
 import (
 	"context"
 	"testing"
diff --git a/pkg/pipeline/shared_response_test.go b/pkg/pipeline/shared_response_test.go
index bc94cf9..e1ec5ab 100644
--- a/pkg/pipeline/shared_response_test.go
+++ b/pkg/pipeline/shared_response_test.go
@@ -1,7 +1,5 @@
 package pipeline_test
 
-// FRD: specs/frds/FRD-20260303-shared-response-move.md.
-
 import (
 	"context"
 	"errors"
diff --git a/pkg/pipeline/workerpool_test.go b/pkg/pipeline/workerpool_test.go
index 6a14cae..8eb1f9b 100644
--- a/pkg/pipeline/workerpool_test.go
+++ b/pkg/pipeline/workerpool_test.go
@@ -11,8 +11,6 @@ import (
 	"github.com/stretchr/testify/require"
 )
 
-// FRD: specs/frds/FRD-20260310-worker-pool.md.
-
 var errWorker = errors.New("worker failed")
 
 func TestWorkerPool_EmptyItems(t *testing.T) {
@@ -235,8 +233,6 @@ func TestWorkerPool_ErrorCancelsContext(t *testing.T) {
 	assert.ErrorIs(t, err, errWorker)
 }
 
-// FRD: specs/frds/FRD-20260311-streaming-file-discovery.md.
-
 func TestWorkerPool_RunChan_EmptyChannel(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/safeconv/generic_test.go b/pkg/safeconv/generic_test.go
index 1d28a38..f4406c2 100644
--- a/pkg/safeconv/generic_test.go
+++ b/pkg/safeconv/generic_test.go
@@ -1,5 +1,3 @@
-// FRD: specs/frds/FRD-20260310-generic-safeconv.md.
-
 package safeconv
 
 import (
diff --git a/pkg/safeconv/safeconv_test.go b/pkg/safeconv/safeconv_test.go
index d52a7c9..2af1a0b 100644
--- a/pkg/safeconv/safeconv_test.go
+++ b/pkg/safeconv/safeconv_test.go
@@ -1,7 +1,5 @@
 package safeconv
 
-// FRD: specs/frds/FRD-20260302-safeconv-expansion.md.
-
 import (
 	"math"
 	"testing"
diff --git a/pkg/sigutil/guard_test.go b/pkg/sigutil/guard_test.go
index a216942..3ae057a 100644
--- a/pkg/sigutil/guard_test.go
+++ b/pkg/sigutil/guard_test.go
@@ -1,7 +1,5 @@
 package sigutil_test
 
-// FRD: specs/frds/FRD-20260302-signal-cleanup-guard.md.
-
 import (
 	"io"
 	"log/slog"
diff --git a/pkg/textutil/textutil_test.go b/pkg/textutil/textutil_test.go
index 01ce985..bd7b474 100644
--- a/pkg/textutil/textutil_test.go
+++ b/pkg/textutil/textutil_test.go
@@ -9,8 +9,6 @@ import (
 	"github.com/stretchr/testify/require"
 )
 
-// FRD: specs/frds/FRD-20260310-writejson-helper.md.
-
 func TestWriteJSON_PrettyOutput(t *testing.T) {
 	t.Parallel()
 
diff --git a/pkg/uast/parsefile_test.go b/pkg/uast/parsefile_test.go
index 40f47d6..2cdf189 100644
--- a/pkg/uast/parsefile_test.go
+++ b/pkg/uast/parsefile_test.go
@@ -1,5 +1,3 @@
-// FRD: specs/frds/FRD-20260310-parse-source-file.md.
-
 package uast
 
 import (
diff --git a/pkg/uast/parser_bench_test.go b/pkg/uast/parser_bench_test.go
index 7f919f7..f7a4e87 100644
--- a/pkg/uast/parser_bench_test.go
+++ b/pkg/uast/parser_bench_test.go
@@ -1,7 +1,5 @@
 package uast_test
 
-// FRD: specs/frds/FRD-20260311-eager-tree-release.md.
-
 import (
 	"context"
 	"fmt"
diff --git a/pkg/uast/parser_determinism_test.go b/pkg/uast/parser_determinism_test.go
new file mode 100644
index 0000000..546a303
--- /dev/null
+++ b/pkg/uast/parser_determinism_test.go
@@ -0,0 +1,142 @@
+package uast
+
+import (
+	"context"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/pkg/uast/pkg/node"
+)
+
+// TestParser_DeterministicAcrossParses guards against state leaking through
+// the parseContext [sync.Pool] — most notably the shared ctx.batchChildren
+// backing array, which previously caused recursive processChildrenBatch
+// calls to overwrite outer-loop entries the parent had not yet read.
+//
+// The input is a Go file whose root and inner blocks each have well over
+// the cursorThreshold of named children, exercising both the batch and
+// recursive paths. Parsing the same content with the same *Parser must
+// produce structurally identical trees on every call.
+func TestParser_DeterministicAcrossParses(t *testing.T) {
+	t.Parallel()
+
+	src := []byte(`package main
+
+import (
+	"context"
+	"errors"
+	"fmt"
+	"io"
+	"os"
+	"strings"
+	"sync"
+	"time"
+)
+
+func a() {}
+func b() {}
+func c() {}
+func d() {}
+func e() {}
+func f() {}
+func g() {}
+func h() {}
+func i() {}
+func j() {}
+
+func work(ctx context.Context, w io.Writer) error {
+	if ctx == nil {
+		return errors.New("nil ctx")
+	}
+
+	var mu sync.Mutex
+	mu.Lock()
+	defer mu.Unlock()
+
+	parts := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}
+	out := strings.Join(parts, ",")
+
+	for idx, p := range parts {
+		if p == "" {
+			continue
+		}
+		fmt.Fprintf(w, "%d:%s\n", idx, p)
+	}
+
+	now := time.Now()
+	if _, err := fmt.Fprintln(w, out, now); err != nil {
+		return err
+	}
+	if _, err := fmt.Fprintln(os.Stderr, "done"); err != nil {
+		return err
+	}
+	return nil
+}
+`)
+
+	parser, err := NewParser()
+	require.NoError(t, err)
+
+	const runs = 8
+
+	first, err := parser.Parse(context.Background(), "main.go", src)
+	require.NoError(t, err)
+	require.NotNil(t, first)
+
+	wantNodes := countAllNodes(first)
+	wantFuncs := countFunctionNodes(first)
+	node.ReleaseTree(first)
+
+	require.Positive(t, wantNodes, "baseline tree must be non-empty")
+	require.GreaterOrEqual(t, wantFuncs, 11, "expected at least 11 functions in the fixture")
+
+	for run := 2; run <= runs; run++ {
+		tree, parseErr := parser.Parse(context.Background(), "main.go", src)
+		require.NoErrorf(t, parseErr, "parse run %d failed", run)
+		require.NotNil(t, tree)
+
+		gotNodes := countAllNodes(tree)
+		gotFuncs := countFunctionNodes(tree)
+		node.ReleaseTree(tree)
+
+		assert.Equalf(t, wantNodes, gotNodes,
+			"node count drift on run %d: want %d, got %d (parseContext buffer corruption?)",
+			run, wantNodes, gotNodes)
+		assert.Equalf(t, wantFuncs, gotFuncs,
+			"function count drift on run %d: want %d, got %d (parseContext buffer corruption?)",
+			run, wantFuncs, gotFuncs)
+	}
+}
+
+func countAllNodes(n *node.Node) int {
+	if n == nil {
+		return 0
+	}
+
+	total := 1
+	for _, child := range n.Children {
+		total += countAllNodes(child)
+	}
+
+	return total
+}
+
+func countFunctionNodes(n *node.Node) int {
+	if n == nil {
+		return 0
+	}
+
+	count := 0
+	if n.HasAnyType(node.UASTFunction, node.UASTMethod) ||
+		n.HasAllRoles(node.RoleFunction, node.RoleDeclaration) {
+		count = 1
+	}
+
+	for _, child := range n.Children {
+		count += countFunctionNodes(child)
+	}
+
+	return count
+}
diff --git a/pkg/uast/parser_dsl.go b/pkg/uast/parser_dsl.go
index c127344..f63dcf8 100644
--- a/pkg/uast/parser_dsl.go
+++ b/pkg/uast/parser_dsl.go
@@ -506,8 +506,16 @@ func (ctx *parseContext) processChildrenBatch(
 		return ctx.processChildrenCursor(root, mappingRule, children)
 	}
 
+	// Snapshot child nodes before recursing. toCanonicalNode may indirectly
+	// re-enter processChildrenBatch, which calls ensureBatchChildren and
+	// reslices ctx.batchChildren over the same backing array — overwriting
+	// the entries we have not yet read.
+	siblings := make([]sitter.Node, written)
 	for idx := range written {
-		child := batchChildToNode(batchChildren[idx])
+		siblings[idx] = batchChildToNode(batchChildren[idx])
+	}
+
+	for _, child := range siblings {
 		if child.IsNull() || !child.IsNamed() {
 			return ctx.processChildrenCursor(root, mappingRule, children)
 		}
diff --git a/pkg/units/units_test.go b/pkg/units/units_test.go
index 05e1618..368d805 100644
--- a/pkg/units/units_test.go
+++ b/pkg/units/units_test.go
@@ -2,8 +2,6 @@ package units
 
 import "testing"
 
-// FRD: specs/frds/FRD-20260302-size-unit-constants.md.
-
 // Expected binary size multiplier values.
 const (
 	expectedKiB = 1024
diff --git a/scripts/bench-hibernation/main.go b/scripts/bench-hibernation/main.go
index 071ae5c..b442925 100644
--- a/scripts/bench-hibernation/main.go
+++ b/scripts/bench-hibernation/main.go
@@ -22,8 +22,8 @@ import (
 	filehistory "github.com/Sumatoshi-tech/codefang/internal/analyzers/file_history"
 	"github.com/Sumatoshi-tech/codefang/internal/analyzers/plumbing"
 	"github.com/Sumatoshi-tech/codefang/internal/framework"
-	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 	"github.com/Sumatoshi-tech/codefang/internal/streaming"
+	"github.com/Sumatoshi-tech/codefang/pkg/gitlib"
 )
 
 func main() {
diff --git a/scripts/orphan-packages.sh b/scripts/orphan-packages.sh
new file mode 100755
index 0000000..e2977c2
--- /dev/null
+++ b/scripts/orphan-packages.sh
@@ -0,0 +1,103 @@
+#!/bin/bash
+# orphan-packages.sh - Detect Go packages that exist on disk but are not
+# imported by any other package.
+#
+# A package is an orphan when no other package in the module imports it —
+# even if it has its own tests. Self-contained test-only packages that
+# nothing depends on are still dead weight in the repo.
+#
+# Entry points (main packages) are excluded from orphan detection since
+# they are invoked by the Go toolchain directly.
+#
+# These packages are invisible to deadcode analysis.
+# Common cause: speculatively written code with no importer yet.
+#
+# Whitelist: .orphan-packages-whitelist (one package path per line, # comments ok)
+#
+# Usage: ./scripts/orphan-packages.sh [package-patterns...]
+#   Default patterns: ./cmd/... ./pkg/... ./internal/...
+
+set -e
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ROOT_DIR="$(dirname "$SCRIPT_DIR")"
+WHITELIST_FILE="$ROOT_DIR/.orphan-packages-whitelist"
+
+cd "$ROOT_DIR"
+
+PATTERNS=("$@")
+if [ ${#PATTERNS[@]} -eq 0 ]; then
+    PATTERNS=("./cmd/..." "./pkg/..." "./internal/...")
+fi
+
+MOD=$(go list -m)
+
+# All packages on disk within the given patterns.
+ALL_PKGS=$(go list "${PATTERNS[@]}" 2>/dev/null | sort -u)
+
+# Collect every intra-module import from OTHER packages.
+# Production imports, test imports, and external test imports all count
+# as evidence that the target package is needed.
+IMPORTED_BY_OTHERS=$(
+    go list -json "${PATTERNS[@]}" 2>/dev/null \
+    | jq -r --arg mod "$MOD" '
+        .ImportPath as $self |
+        ((.Imports // []) + (.TestImports // []) + (.XTestImports // []))[] |
+        select(startswith($mod + "/")) |
+        select(. != $self)
+    ' \
+    | sort -u
+)
+
+# Main packages are entry points — they can't be orphans.
+MAIN_PKGS=$(
+    go list -json "${PATTERNS[@]}" 2>/dev/null \
+    | jq -r 'select(.Name == "main") | .ImportPath' \
+    | sort -u
+)
+
+# A package is NOT an orphan if it is imported by another package, OR it is a main package.
+NOT_ORPHAN=$(printf '%s\n%s\n' "$IMPORTED_BY_OTHERS" "$MAIN_PKGS" | sort -u)
+
+ORPHANS=$(comm -23 <(echo "$ALL_PKGS") <(echo "$NOT_ORPHAN"))
+
+# Apply whitelist if present.
+if [ -f "$WHITELIST_FILE" ]; then
+    WHITELIST=$(grep -v '^\s*#' "$WHITELIST_FILE" | grep -v '^\s*$' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | sed "s|^|${MOD}/|" | sort -u)
+    WHITELIST_COUNT=$(echo "$WHITELIST" | grep -c . || true)
+    ORPHANS_FILTERED=$(comm -23 <(echo "$ORPHANS") <(echo "$WHITELIST"))
+    FILTERED_COUNT=$(( $(echo "$ORPHANS" | grep -c . || true) - $(echo "$ORPHANS_FILTERED" | grep -c . || true) ))
+    ORPHANS="$ORPHANS_FILTERED"
+else
+    WHITELIST_COUNT=0
+    FILTERED_COUNT=0
+fi
+
+if [ -z "$ORPHANS" ]; then
+    if [ "$FILTERED_COUNT" -gt 0 ]; then
+        echo "✓ No orphan packages found (excluding $FILTERED_COUNT/$WHITELIST_COUNT whitelisted)"
+    else
+        echo "✓ No orphan packages found"
+    fi
+    exit 0
+fi
+
+echo "Orphan packages (not imported by any other package):"
+echo ""
+
+COUNT=0
+while IFS= read -r pkg; do
+    [ -z "$pkg" ] && continue
+    rel="${pkg#"${MOD}"/}"
+    echo "  $rel"
+    COUNT=$((COUNT + 1))
+done <<< "$ORPHANS"
+
+echo ""
+if [ "$FILTERED_COUNT" -gt 0 ]; then
+    echo "$COUNT orphan package(s) found ($FILTERED_COUNT/$WHITELIST_COUNT whitelisted excluded)."
+else
+    echo "$COUNT orphan package(s) found."
+fi
+echo "Either import them or delete them."
+exit 1
diff --git a/site/analyzers/complexity.md b/site/analyzers/complexity.md
index dde3737..c6e145b 100644
--- a/site/analyzers/complexity.md
+++ b/site/analyzers/complexity.md
@@ -71,36 +71,62 @@ The complexity analyzer uses the UAST directly and has no analyzer-specific conf
 
     ```json
     {
-      "complexity": {
-        "functions": [
-          {
-            "name": "processFile",
-            "file": "main.go",
-            "line": 42,
-            "cyclomatic": 8,
-            "cognitive": 12,
-            "nesting_depth": 3
-          },
-          {
-            "name": "validate",
-            "file": "main.go",
-            "line": 105,
-            "cyclomatic": 15,
-            "cognitive": 22,
-            "nesting_depth": 5
-          }
-        ],
-        "summary": {
-          "total_functions": 2,
-          "avg_cyclomatic": 11.5,
-          "avg_cognitive": 17.0,
-          "max_cyclomatic": 15,
-          "max_nesting_depth": 5
+      "function_complexity": [
+        {
+          "name": "processFile",
+          "source_file": "cmd/server/main.go",
+          "language": "go",
+          "directory": "cmd/server",
+          "cyclomatic_complexity": 8,
+          "cognitive_complexity": 12,
+          "nesting_depth": 3,
+          "lines_of_code": 45,
+          "complexity_density": 0.178,
+          "risk_level": "LOW"
+        },
+        {
+          "name": "validate",
+          "source_file": "cmd/server/main.go",
+          "language": "go",
+          "directory": "cmd/server",
+          "cyclomatic_complexity": 15,
+          "cognitive_complexity": 22,
+          "nesting_depth": 5,
+          "lines_of_code": 80,
+          "complexity_density": 0.188,
+          "risk_level": "MEDIUM"
         }
+      ],
+      "high_risk_functions": [
+        {
+          "name": "validate",
+          "source_file": "cmd/server/main.go",
+          "language": "go",
+          "directory": "cmd/server",
+          "cyclomatic_complexity": 15,
+          "cognitive_complexity": 22,
+          "risk_level": "MEDIUM",
+          "issues": ["High cyclomatic complexity", "Deep nesting"]
+        }
+      ],
+      "distribution": {
+        "simple": 180,
+        "moderate": 25,
+        "complex": 7
+      },
+      "aggregate": {
+        "total_functions": 212,
+        "average_complexity": 3.2,
+        "max_complexity": 15,
+        "health_score": 78.5,
+        "message": "Fair complexity - some functions could be simplified"
       }
     }
     ```
 
+    Each function record includes `source_file`, `language`, and `directory`
+    for file-level joins and DWH aggregation.
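+
+    As a rough illustration of that aggregation, the Go sketch below groups
+    function records by `directory`. It assumes only the field names shown
+    in the JSON above; the `functionRecord` and `dirStats` types and the
+    `aggregateByDirectory` helper are hypothetical names, not part of
+    Codefang.
+
+    ```go
+    package main
+
+    import (
+    	"encoding/json"
+    	"fmt"
+    	"sort"
+    )
+
+    // functionRecord holds the subset of per-function fields used here.
+    // Hypothetical type; field names follow the JSON example above.
+    type functionRecord struct {
+    	Name       string `json:"name"`
+    	SourceFile string `json:"source_file"`
+    	Language   string `json:"language"`
+    	Directory  string `json:"directory"`
+    	Cyclomatic int    `json:"cyclomatic_complexity"`
+    }
+
+    // dirStats is one aggregated row per directory.
+    type dirStats struct {
+    	Directory     string
+    	Functions     int
+    	MaxCyclomatic int
+    }
+
+    // aggregateByDirectory counts functions and tracks the highest
+    // cyclomatic complexity per directory, sorted by directory name.
+    func aggregateByDirectory(fns []functionRecord) []dirStats {
+    	byDir := map[string]*dirStats{}
+    	for _, fn := range fns {
+    		s, ok := byDir[fn.Directory]
+    		if !ok {
+    			s = &dirStats{Directory: fn.Directory}
+    			byDir[fn.Directory] = s
+    		}
+    		s.Functions++
+    		if fn.Cyclomatic > s.MaxCyclomatic {
+    			s.MaxCyclomatic = fn.Cyclomatic
+    		}
+    	}
+
+    	out := make([]dirStats, 0, len(byDir))
+    	for _, s := range byDir {
+    		out = append(out, *s)
+    	}
+    	sort.Slice(out, func(i, j int) bool { return out[i].Directory < out[j].Directory })
+
+    	return out
+    }
+
+    func main() {
+    	raw := []byte(`{"function_complexity":[
+    	  {"name":"processFile","directory":"cmd/server","cyclomatic_complexity":8},
+    	  {"name":"validate","directory":"cmd/server","cyclomatic_complexity":15}]}`)
+
+    	var report struct {
+    		Functions []functionRecord `json:"function_complexity"`
+    	}
+    	if err := json.Unmarshal(raw, &report); err != nil {
+    		panic(err)
+    	}
+
+    	for _, s := range aggregateByDirectory(report.Functions) {
+    		fmt.Printf("%s functions=%d max_cyclomatic=%d\n", s.Directory, s.Functions, s.MaxCyclomatic)
+    	}
+    }
+    ```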
+
 === "Text"
 
     ```
diff --git a/site/analyzers/couples.md b/site/analyzers/couples.md
index 4ae7804..dfb5234 100644
--- a/site/analyzers/couples.md
+++ b/site/analyzers/couples.md
@@ -110,7 +110,9 @@ The couples analyzer provides a `ReportSection` for use in combined reports:
       "developer_coupling": [
         {
           "developer1": "alice",
+          "developer1_email": "alice@example.com",
           "developer2": "bob",
+          "developer2_email": "bob@example.com",
           "shared_file_changes": 234,
           "coupling_strength": 0.65
         }
diff --git a/site/analyzers/developers.md b/site/analyzers/developers.md
index ada1262..c527510 100644
--- a/site/analyzers/developers.md
+++ b/site/analyzers/developers.md
@@ -115,15 +115,16 @@ history:
         {
           "id": 0,
           "name": "alice",
+          "email": "alice@example.com",
           "commits": 342,
           "lines_added": 28500,
           "lines_removed": 12300,
           "lines_changed": 8400,
           "net_lines": 16200,
-          "languages": {
-            "Go": {"added": 22000, "removed": 9800, "changed": 6200},
-            "Python": {"added": 6500, "removed": 2500, "changed": 2200}
-          },
+          "languages": [
+            {"language": "Go", "added": 22000, "removed": 9800, "changed": 6200},
+            {"language": "Python", "added": 6500, "removed": 2500, "changed": 2200}
+          ],
           "first_tick": 0,
           "last_tick": 120,
           "active_ticks": 85
@@ -135,12 +136,6 @@ history:
           "total_lines": 45000,
           "total_contribution": 67800,
           "contributors": {"0": 54600, "1": 13200}
-        },
-        {
-          "name": "Python",
-          "total_lines": 12000,
-          "total_contribution": 16700,
-          "contributors": {"0": 11200, "1": 5500}
         }
       ],
       "busfactor": [
@@ -148,24 +143,41 @@ history:
           "language": "Python",
           "bus_factor": 1,
           "total_contributors": 2,
+          "primary_dev_id": 0,
           "primary_dev_name": "alice",
+          "primary_dev_email": "alice@example.com",
           "primary_percentage": 67.1,
+          "secondary_dev_id": 1,
           "secondary_dev_name": "bob",
+          "secondary_dev_email": "bob@example.com",
           "secondary_percentage": 32.9,
           "risk_level": "MEDIUM"
         }
       ],
       "activity": [
-        {"tick": 0, "total_commits": 5, "by_developer": {"0": 3, "1": 2}},
-        {"tick": 1, "total_commits": 8, "by_developer": {"0": 5, "1": 3}}
+        {
+          "tick": 0,
+          "start_time": "2024-01-15T10:30:00Z",
+          "end_time": "2024-01-16T08:45:00Z",
+          "total_commits": 5,
+          "by_developer": [
+            {"dev_id": 0, "commits": 3},
+            {"dev_id": 1, "commits": 2}
+          ]
+        }
       ],
       "churn": [
-        {"tick": 0, "lines_added": 450, "lines_removed": 120, "net_change": 330}
+        {
+          "tick": 0,
+          "start_time": "2024-01-15T10:30:00Z",
+          "end_time": "2024-01-16T08:45:00Z",
+          "lines_added": 450,
+          "lines_removed": 120,
+          "net_change": 330
+        }
       ],
       "aggregate": {
         "total_commits": 850,
-        "total_lines_added": 95000,
-        "total_lines_removed": 42000,
         "total_developers": 5,
         "active_developers": 3,
         "analysis_period_ticks": 120,
@@ -175,6 +187,15 @@ history:
     }
     ```
 
+    **Key fields for analytics:**
+
+    - `developers[].email` — split from previously pipe-delimited name
+    - `developers[].languages` — flattened from map to sorted array
+    - `activity[].by_developer` — flattened from `map[int]int` to `[{dev_id, commits}]` array
+    - `activity[].start_time` / `end_time` — RFC 3339 tick boundaries
+    - `churn[].start_time` / `end_time` — RFC 3339 tick boundaries
+    - `busfactor[].primary_dev_email` / `secondary_dev_email` — split identity fields
+
 === "YAML"
 
     ```yaml
diff --git a/site/analyzers/file-history.md b/site/analyzers/file-history.md
index b2872f1..d5c3a1b 100644
--- a/site/analyzers/file-history.md
+++ b/site/analyzers/file-history.md
@@ -89,12 +89,12 @@ The file history analyzer has no additional configuration options.
       "file_contributors": [
         {
           "path": "pkg/core/engine.go",
-          "contributors": {
-            "0": {"added": 2200, "removed": 900, "changed": 600},
-            "1": {"added": 800, "removed": 700, "changed": 250},
-            "2": {"added": 150, "removed": 150, "changed": 80},
-            "3": {"added": 50, "removed": 50, "changed": 20}
-          },
+          "contributors": [
+            {"dev_id": 0, "added": 2200, "removed": 900, "changed": 600},
+            {"dev_id": 1, "added": 800, "removed": 700, "changed": 250},
+            {"dev_id": 2, "added": 150, "removed": 150, "changed": 80},
+            {"dev_id": 3, "added": 50, "removed": 50, "changed": 20}
+          ],
           "top_contributor_id": 0,
           "top_contributor_lines": 2800
         }
diff --git a/site/analyzers/sentiment.md b/site/analyzers/sentiment.md
index f723315..d5c3f47 100644
--- a/site/analyzers/sentiment.md
+++ b/site/analyzers/sentiment.md
@@ -142,6 +142,8 @@ history:
   "time_series": [
     {
       "tick": 0,
+      "start_time": "2024-01-15T10:30:00Z",
+      "end_time": "2024-01-16T08:45:00Z",
       "sentiment": 0.72,
       "comment_count": 12,
       "commit_count": 5,
@@ -149,6 +151,8 @@ history:
     },
     {
       "tick": 1,
+      "start_time": "2024-01-16T09:00:00Z",
+      "end_time": "2024-01-17T18:30:00Z",
       "sentiment": 0.35,
       "comment_count": 8,
       "commit_count": 3,
diff --git a/site/guide/cli-reference.md b/site/guide/cli-reference.md
index ec85ea6..3fe4f81 100644
--- a/site/guide/cli-reference.md
+++ b/site/guide/cli-reference.md
@@ -128,6 +128,103 @@ codefang run -a history/couples --limit 500 .
     The burndown analyzer automatically enables `--first-parent` when selected.
     This is required for correct line-tracking across merge commits.
 
+#### Language Filtering
+
+| Flag | Type | Default | Description |
+|------|------|---------|-------------|
+| `--languages` | `[]string` | `[all]` | Restrict analysis to the given Linguist languages; comma-separated. `all` (default) disables the filter. Applies to **both** history and static phases. |
+
+**History phase** — the filter is pushed down into libgit2's `pathspec`
+at the tree-diff stage, so non-matching files are skipped before the
+diff crosses the cgo boundary. On a polyglot repo a narrow filter can
+reduce wall time by 30–40 %. The Go-side language check still runs as
+the authoritative pass for content-disambiguated extensions (`.h`,
+`.pl`, `.m`, `.r`).
+
+**Static phase** — the filter is applied at the directory walker
+(`matchesLanguageGlobs`) before the UAST parser or raw-file analyzers
+see the file. It's path-based only: the parser's own language router
+remains the final authority for how a matched file is parsed (e.g. a
+`.h` under `--languages c++` is still parsed as C). Both phases read
+from the same `langpath.Globs` helper, so the flag value has one
+meaning across `-a 'static/*'`, `-a 'history/*'`, and `-a '*'` runs.
+
+Language names are [Linguist keys](https://github.com/github/linguist/blob/master/lib/linguist/languages.yml)
+and common aliases resolve automatically:
+
+```bash
+# Canonical names (any case, whitespace is trimmed)
+codefang run -a 'history/devs' --languages go,python,typescript .
+
+# Aliases resolve via enry
+codefang run -a 'history/devs' --languages golang,js,ts .
+
+# Unknown language fails fast at configure time instead of silently
+# returning an empty report:
+codefang run -a 'history/devs' --languages notalang .
+# → Error: failed to configure TreeDiff: tree-diff pathspec: unknown language: "notalang"
+```
+
+Filename-only languages (e.g. `Dockerfile`, `Makefile`) are also supported:
+
+```bash
+codefang run -a 'history/devs' --languages dockerfile .
+```
+
+See `specs/optimize-lang/PROPOSAL.md` for the architecture and acceptance-gate
+numbers.
+
+#### Vendor & Generated Exclusion
+
+By default, Codefang excludes **vendored dependencies** and **auto-generated
+files** from analysis. This matches the convention of every major
+single-language analyser (`go vet`, `eslint`, `ruff`, `rubocop`, `scalafix`,
+`phpcs`, …) — vendor/generated code is noise for a code-quality report.
+
+| Flag | Type | Default | Description |
+|------|------|---------|-------------|
+| `--include-vendored` | `bool` | `false` | Re-include vendored dependencies (detected by enry / Linguist) in analysis. Cross-language: covers `vendor/`, `node_modules/`, `third_party/`, `testdata/`, minified bundles, and more. |
+| `--include-generated` | `bool` | `false` | Re-include auto-generated files in analysis. Covers `*.pb.go`, `zz_generated_*.go`, `*_pb2.py`, `*.min.js`, and any file whose first 512 bytes contain a generated-file marker (`DO NOT EDIT`, `Code generated`, etc.). |
+| `--extra-excluded-prefixes` | `[]string` | `[]` | Additional UNIX path prefixes to exclude on top of enry heuristics (e.g. `".venv/,target/,build/"`). |
+
+All three flags apply **identically** to both static and history phases.
+
+```bash
+# default: your own code only
+codefang run -a '*' .
+
+# include vendored deps (node_modules/, vendor/, …)
+codefang run -a '*' --include-vendored .
+
+# restore pre-codefang-2026-04 behaviour (include everything)
+codefang run -a '*' --include-vendored --include-generated .
+
+# skip extras that enry doesn't know about (e.g. Python venv, Rust target/)
+codefang run -a '*' --extra-excluded-prefixes '.venv/,target/' .
+```
+
+!!! warning "Breaking change in 2026-04"
+
+    Earlier versions of Codefang analysed vendored and generated files by
+    default (they needed the confusingly-named `--skip-blacklist=true` to be
+    excluded). Starting from 2026-04, defaults flip: vendor / generated
+    are **excluded by default**. To restore the old behaviour:
+
+    ```bash
+    codefang run ... --include-vendored --include-generated
+    ```
+
+    The deprecated `--skip-blacklist` and `--blacklisted-prefixes` flags
+    still work with a cobra deprecation warning and will be removed in
+    the next minor release. Map them to:
+
+    - `--skip-blacklist` → no-op (the new default already excludes)
+    - `--blacklisted-prefixes X,Y` → `--extra-excluded-prefixes X,Y`
+
+See `specs/exclude-vendored/PROPOSAL.md` for the full cross-phase design
+and `specs/frds/FRD-20260419-exclude-vendored.md` for implementation
+details.
+
 #### Pipeline Tuning Flags
 
 | Flag | Type | Default | Description |
diff --git a/site/guide/data-analytics.md b/site/guide/data-analytics.md
new file mode 100644
index 0000000..4e60676
--- /dev/null
+++ b/site/guide/data-analytics.md
@@ -0,0 +1,683 @@
+# Data Analytics & DWH Integration
+
+Codefang produces richly structured JSON output designed for loading into
+columnar data warehouses (ClickHouse, Greenplum, BigQuery, Snowflake) and
+building BI dashboards. This guide covers the optimal pipeline from
+repository analysis to production dashboards.
+
+---
+
+## Quick Start
+
+Analyze a repository and produce DWH-ready output:
+
+```bash
+# JSON for small-to-medium repos (< 5K files, < 10K commits)
+codefang run --format json --per-file --memory-budget 4GB /path/to/repo > report.json
+
+# NDJSON for large repos (streaming, one line per analyzer)
+codefang run --format ndjson --per-file --memory-budget 8GB /path/to/repo > report.ndjson
+
+# Limit history depth for faster iteration
+codefang run --format json --per-file --limit 5000 /path/to/repo > report.json
+```
+
+---
+
+## Output Format Selection
+
+| Repo Size | Recommended Format | Reason |
+|-----------|-------------------|--------|
+| < 1K files | `json` | Small file, easy to inspect |
+| 1K-10K files | `json` | Manageable (< 500MB typically) |
+| 10K-50K files | `ndjson` | JSON gets multi-GB; NDJSON streams |
+| 50K+ files | `ndjson` + `--limit` | Bound history for practical runtimes |
+
+### JSON Format
+
+```bash
+codefang run --format json --per-file /repo > report.json
+```
+
+Produces a single JSON object with a versioned envelope:
+
+```json
+{
+  "version": "codefang.run.v1",
+  "metadata": {
+    "repo_name": "myproject",
+    "analyzed_at": "2026-04-08T10:00:00Z",
+    "codefang_version": "0.1.0"
+  },
+  "analyzers": [
+    {
+      "id": "static/complexity",
+      "mode": "static",
+      "schema": { ... },
+      "report": { ... }
+    }
+  ]
+}
+```
+
+### NDJSON Format
+
+```bash
+codefang run --format ndjson --per-file /repo > report.ndjson
+```
+
+One JSON line per analyzer; the first line is the metadata envelope:
+
+```
+{"version":"codefang.run.v1","metadata":{"repo_name":"myproject",...}}
+{"id":"static/complexity","mode":"static","report":{...}}
+{"id":"history/sentiment","mode":"history","report":{...}}
+```
+
+Process with standard tools:
+
+```bash
+# Extract one analyzer
+grep '"static/complexity"' report.ndjson | jq .report.aggregate
+
+# Count analyzers (skip the leading metadata line)
+tail -n +2 report.ndjson | wc -l
+
+# Stream into ClickHouse
+cat report.ndjson | clickhouse-client --query "INSERT INTO codefang_raw FORMAT JSONEachRow"
+```
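+
+The same "first line is metadata" convention matters when reading the file
+programmatically. The following is a minimal Go sketch, assuming only the
+line shapes shown above; the `analyzerLine` type and `splitNDJSON` helper are
+hypothetical names, and the 1 GiB per-line cap is an arbitrary assumption.
+
+```go
+package main
+
+import (
+	"bufio"
+	"bytes"
+	"encoding/json"
+	"fmt"
+	"io"
+	"strings"
+)
+
+// analyzerLine mirrors the per-analyzer line shape shown above.
+// Hypothetical type for illustration; the report is kept as raw JSON.
+type analyzerLine struct {
+	ID     string          `json:"id"`
+	Mode   string          `json:"mode"`
+	Report json.RawMessage `json:"report"`
+}
+
+// splitNDJSON returns the raw metadata line and the decoded analyzer lines.
+func splitNDJSON(r io.Reader) (json.RawMessage, []analyzerLine, error) {
+	sc := bufio.NewScanner(r)
+	// Analyzer lines can be far larger than the scanner's default limit.
+	sc.Buffer(make([]byte, 0, 64*1024), 1<<30)
+
+	var meta json.RawMessage
+	var lines []analyzerLine
+
+	for sc.Scan() {
+		raw := bytes.TrimSpace(sc.Bytes())
+		if len(raw) == 0 {
+			continue
+		}
+		if meta == nil {
+			// Copy: the scanner reuses its buffer on the next Scan call.
+			meta = append(json.RawMessage(nil), raw...)
+			continue
+		}
+		var line analyzerLine
+		if err := json.Unmarshal(raw, &line); err != nil {
+			return nil, nil, fmt.Errorf("decode analyzer line: %w", err)
+		}
+		lines = append(lines, line)
+	}
+
+	return meta, lines, sc.Err()
+}
+
+func main() {
+	input := `{"version":"codefang.run.v1","metadata":{"repo_name":"myproject"}}
+{"id":"static/complexity","mode":"static","report":{}}
+{"id":"history/sentiment","mode":"history","report":{}}
+`
+	_, lines, err := splitNDJSON(strings.NewReader(input))
+	if err != nil {
+		panic(err)
+	}
+	for _, l := range lines {
+		fmt.Println(l.ID, l.Mode)
+	}
+}
+```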
+
+---
+
+## Memory Budget
+
+**Always set `--memory-budget`** for repos with history analysis. Without it,
+the streaming pipeline uses a conservative 2GB default that may OOM on large
+repos.
+
+| Machine RAM | Recommended Budget | Handles |
+|-------------|-------------------|---------|
+| 8 GB | `--memory-budget 2GB` | Repos up to ~10K commits |
+| 16 GB | `--memory-budget 4GB` | Repos up to ~30K commits |
+| 32 GB | `--memory-budget 8GB` | Repos up to ~60K commits |
+| 64 GB | `--memory-budget 16GB` | Repos up to ~150K commits |
+
+The budget controls the streaming chunk planner — larger budgets mean fewer,
+bigger chunks and faster processing. The actual RSS will be ~2x the budget
+due to Go runtime overhead and native memory.
+
+```bash
+# 64GB machine, kubernetes-sized repo (~56K commits)
+codefang run --format ndjson --per-file --memory-budget 8GB ~/sources/kubernetes
+```
+
+!!! warning "Without `--memory-budget`"
+    The default 2GB budget may cause the process to be killed by the OS OOM
+    killer on large repos. Always set this flag explicitly.
+
+---
+
+## Commit Limiting
+
+Use `--limit N` to analyze only the most recent N commits. This is useful for:
+
+- **Fast iteration**: Test your ETL pipeline on a subset before running full history
+- **Incremental analysis**: Analyze only recent changes for daily dashboards
+- **Memory control**: Fewer commits = less memory, faster processing
+
+```bash
+# Last 1000 commits (fast, ~2 min)
+codefang run --format json --per-file --limit 1000 /repo > recent.json
+
+# Last 10000 commits (moderate, ~15 min)
+codefang run --format json --per-file --limit 10000 --memory-budget 4GB /repo > report.json
+
+# Full history (slow, may take hours for large repos)
+codefang run --format json --per-file --memory-budget 8GB /repo > full.json
+```
+
+---
+
+## Key Fields for Analytics
+
+Every function-level record includes fields designed for DWH joins and
+aggregation:
+
+| Field | Present On | Type | Example | Use Case |
+|-------|-----------|------|---------|----------|
+| `source_file` | All function records | string | `"pkg/api/server.go"` | Join to file-level data |
+| `language` | All function records | string | `"go"` | Group by language |
+| `directory` | All function records | string | `"pkg/api"` | Group by package/module |
+| `start_time` | All time-series ticks | RFC 3339 | `"2024-01-15T10:30:00Z"` | Time-axis labels |
+| `end_time` | All time-series ticks | RFC 3339 | `"2024-01-16T08:45:00Z"` | Tick duration |
+| `name` | Developer records | string | `"alice"` | Developer dimension |
+| `email` | Developer records | string | `"alice@example.com"` | Developer identity |
+| `dev_id` | Activity, contributors | int | `42` | Foreign key to developers |
+
+---
+
+## Schema Manifest
+
+Every analyzer section includes a `schema` field describing its output:
+
+```json
+{
+  "schema": {
+    "function_complexity": {
+      "type": "list",
+      "grain": "function",
+      "description": "Per-function cyclomatic and cognitive complexity"
+    },
+    "aggregate": {
+      "type": "aggregate",
+      "description": "Summary statistics"
+    }
+  }
+}
+```
+
+**Field types**: `list`, `aggregate`, `time_series`, `risk`, `scalar`
+
+**Grain values**: `function`, `file`, `tick`, `pair`, `developer`, `node`, `comment`, `import`
+
+Use the schema to auto-generate ETL mappings:
+
+```python
+# Python: extract schema for table generation
+import json
+with open('report.json') as f:
+    data = json.load(f)
+for analyzer in data['analyzers']:
+    schema = analyzer.get('schema', {})
+    for field, meta in schema.items():
+        if meta['type'] == 'list':
+            print(f"CREATE TABLE {analyzer['id'].replace('/', '_')}_{field} ...")
+```
+
+---
+
+## Star Schema Design
+
+### Dimensions
+
+```sql
+-- dim_repository
+CREATE TABLE dim_repository (
+    repo_id     UInt64,
+    repo_name   String,
+    repo_path   String,
+    analyzed_at DateTime,
+    version     String
+) ENGINE = MergeTree() ORDER BY repo_id;
+
+-- dim_file (extract from source_file + directory + language)
+CREATE TABLE dim_file (
+    file_id     UInt64,
+    repo_id     UInt64,
+    source_file String,
+    directory   String,
+    language    LowCardinality(String)
+) ENGINE = MergeTree() ORDER BY (repo_id, source_file);
+
+-- dim_developer
+CREATE TABLE dim_developer (
+    dev_id  UInt32,
+    repo_id UInt64,
+    name    String,
+    email   String
+) ENGINE = MergeTree() ORDER BY (repo_id, dev_id);
+
+-- dim_tick
+CREATE TABLE dim_tick (
+    tick_id    UInt32,
+    repo_id    UInt64,
+    tick       UInt32,
+    start_time DateTime,
+    end_time   DateTime
+) ENGINE = MergeTree() ORDER BY (repo_id, tick);
+```
+
+### Fact Tables
+
+```sql
+-- Static analysis facts (per-function grain)
+CREATE TABLE fact_function_complexity (
+    repo_id              UInt64,
+    source_file          String,
+    directory            LowCardinality(String),
+    language             LowCardinality(String),
+    name                 String,
+    cyclomatic_complexity UInt32,
+    cognitive_complexity  UInt32,
+    nesting_depth        UInt8,
+    lines_of_code        UInt32,
+    complexity_density   Float64,
+    risk_level           LowCardinality(String)
+) ENGINE = MergeTree()
+ORDER BY (repo_id, directory, source_file, name);
+
+-- Time-series facts (per-tick grain)
+CREATE TABLE fact_tick_sentiment (
+    repo_id        UInt64,
+    tick           UInt32,
+    start_time     DateTime,
+    end_time       DateTime,
+    sentiment      Float32,
+    classification LowCardinality(String),
+    comment_count  UInt32,
+    commit_count   UInt32
+) ENGINE = MergeTree()
+ORDER BY (repo_id, tick);
+
+-- Developer activity (per-tick-per-developer grain)
+CREATE TABLE fact_developer_activity (
+    repo_id  UInt64,
+    tick     UInt32,
+    dev_id   UInt32,
+    commits  UInt32
+) ENGINE = MergeTree()
+ORDER BY (repo_id, tick, dev_id);
+
+-- File coupling (per-pair grain)
+CREATE TABLE fact_file_coupling (
+    repo_id           UInt64,
+    file1             String,
+    file2             String,
+    co_changes        UInt32,
+    coupling_strength Float64
+) ENGINE = MergeTree()
+ORDER BY (repo_id, file1, file2);
+```
+
+---
+
+## ETL Pipeline
+
+### Python (with dbt or standalone)
+
+```python
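+import hashlib
+import json
+
+with open('report.json') as f:
+    data = json.load(f)
+
+# Extract metadata
+meta = data['metadata']
+# Stable 64-bit key. Python's built-in hash() is salted per process for
+# strings, so it would yield a different repo_id on every run.
+digest = hashlib.sha256(meta['repo_path'].encode('utf-8')).digest()
+repo_id = int.from_bytes(digest[:8], 'big')  # or use a sequence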
+
+# Extract analyzers by ID
+analyzers = {a['id']: a['report'] for a in data['analyzers']}
+
+# Load function complexity
+functions = analyzers['static/complexity']['function_complexity']
+# Each record already has: name, source_file, language, directory,
+# cyclomatic_complexity, cognitive_complexity, etc.
+
+# Load time-series with timestamps
+sentiment_ts = analyzers['history/sentiment']['time_series']
+# Each tick has: tick, start_time, end_time, sentiment, classification, ...
+
+# Load developers
+developers = analyzers['history/devs']['developers']
+# Each has: id, name, email, commits, lines_added, languages (array), ...
+
+# Load file coupling (can be millions of rows)
+coupling = analyzers['history/couples']['file_coupling']
+# Each has: file1, file2, co_changes, coupling_strength
+```
+
+### ClickHouse Direct Load
+
+```bash
+# Extract function complexity from NDJSON
+grep '"static/complexity"' report.ndjson \
+  | jq -c '.report.function_complexity[]' \
+  | clickhouse-client --query "INSERT INTO fact_function_complexity FORMAT JSONEachRow"
+
+# Extract sentiment time-series
+grep '"history/sentiment"' report.ndjson \
+  | jq -c '.report.time_series[]' \
+  | clickhouse-client --query "INSERT INTO fact_tick_sentiment FORMAT JSONEachRow"
+```
+
+---
+
+## Recommended Analyzer Selection
+
+Not all 17 analyzers are needed for every use case. Select based on your
+dashboard needs:
+
+### Code Quality Dashboard
+
+```bash
+codefang run \
+  -a static/complexity,static/halstead,static/cohesion,static/comments \
+  -a history/quality \
+  --format json --per-file /repo
+```
+
+**Produces**: Function-level metrics, quality trend over time.
+**Row count**: ~200K functions + ~4K tick entries for a medium repo.
+
+### Developer Analytics Dashboard
+
+```bash
+codefang run \
+  -a history/devs,history/couples,history/sentiment \
+  --format json /repo
+```
+
+**Produces**: Developer profiles, coupling networks, sentiment trends.
+**Row count**: ~500 developers + ~5K coupling pairs + ~4K ticks.
+
+### File Health Dashboard
+
+```bash
+codefang run \
+  -a static/complexity,static/clones \
+  -a history/file-history,history/couples \
+  --format json --per-file /repo
+```
+
+**Produces**: Per-file complexity, churn hotspots, coupling networks.
+**Row count**: ~30K files + ~100K coupling pairs.
+
+### Full Analysis (Everything)
+
+```bash
+codefang run --format ndjson --per-file --memory-budget 8GB /repo
+```
+
+**Produces**: All 17 analyzers. Use NDJSON for large repos.
+
+---
+
+## Performance Tuning
+
+### Static Analysis Workers
+
+Control parallelism for the UAST parsing phase:
+
+```bash
+# Use 16 workers, e.g. on a 16-core machine (default: min(NumCPU, 8))
+codefang run --static-workers 16 --format json /repo
+```
+
+More workers = faster static phase but higher peak memory.
+
+### History Analysis
+
+The streaming pipeline auto-tunes chunk sizes based on `--memory-budget`.
+No manual tuning needed. Key parameters:
+
+| Parameter | Flag | Default | Effect |
+|-----------|------|---------|--------|
+| Memory budget | `--memory-budget` | 2GB | Controls chunk size |
+| Commit limit | `--limit` | 0 (all) | Bounds history depth |
+| First parent | `--first-parent` | false | Skip merge commits |
+| Since | `--since` | none | Time-based filtering |
+
+```bash
+# Analyze only last 6 months, first-parent only
+codefang run --since 6m --first-parent --format json /repo
+```
+
+!!! note "`--since` with inactive repos"
+    If no commits fall within the `--since` window, history analyzers produce
+    empty results (zero ticks, zero developers). Static analyzers still run
+    normally since they analyze the current file tree, not commit history.
+
+---
+
+## Incremental Analysis & Checkpointing
+
+Codefang supports two persistence mechanisms for long-running analysis:
+**incremental caching** (skip already-processed commits) and **checkpointing**
+(crash recovery).
+
+### Incremental Cache
+
+The incremental cache stores analysis results keyed by repository root SHA and
+branch. On subsequent runs, only new commits since the last cached position
+are processed.
+
+!!! warning "History-only mode required"
+    The incremental cache currently works with history-only runs
+    (`-a 'history/*'`). In the default combined mode (static + history),
+    the cache directory is accepted but may not produce cache files.
+    For incremental DWH loads, run history and static phases separately.
+
+```bash
+# History-only run with cache (incremental)
+codefang run -a 'history/*' --format json --memory-budget 8GB \
+  --cache-dir ~/.codefang/cache /repo > history.json
+
+# Static run (always full, no caching needed — fast)
+codefang run -a 'static/*' --format json --per-file /repo > static.json
+
+# Force full re-analysis (ignore cache)
+codefang run -a 'history/*' --format json --memory-budget 8GB \
+  --cache-dir ~/.codefang/cache --no-cache /repo > history-full.json
+```
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--cache-dir` | none | Directory for incremental cache storage |
+| `--no-cache` | false | Force full re-analysis, ignore existing cache |
+
+The cache stores a metadata file (`cache.json`) with head SHA, branch, commit
+count, and analyzer IDs. If the root SHA changes (force-push or history
+rewrite), the cache is automatically invalidated.
+
+!!! tip "Ideal for daily DWH loads"
+    Point `--cache-dir` to a persistent directory on your CI machine.
+    Each daily run only processes the new commits since yesterday,
+    cutting analysis time from hours to minutes.
+
+### Checkpointing (Crash Recovery)
+
+For very long runs (e.g., a full Kubernetes history analysis at ~3 hours),
+checkpointing saves progress periodically so a crash doesn't lose all work.
+
+```bash
+# Enable checkpointing (on by default)
+codefang run --format json --memory-budget 8GB \
+  --checkpoint --checkpoint-dir ~/.codefang/checkpoints /repo
+
+# Resume from checkpoint after crash
+codefang run --format json --memory-budget 8GB \
+  --resume --checkpoint-dir ~/.codefang/checkpoints /repo
+
+# Clear old checkpoint and start fresh
+codefang run --format json --memory-budget 8GB \
+  --clear-checkpoint /repo
+```
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--checkpoint` | true | Enable periodic checkpointing |
+| `--checkpoint-dir` | `~/.codefang/checkpoints` | Directory for checkpoint files |
+| `--resume` | true | Resume from checkpoint if available |
+| `--clear-checkpoint` | false | Clear existing checkpoint before run |
+
+The checkpoint stores:
+
+- Current chunk position (which commits have been processed)
+- Aggregator spill state (intermediate results on disk)
+- Repository hash (for validation on resume)
+
+!!! info "Auto-cleanup on success"
+    Checkpoint files are **automatically deleted** after a successful run.
+    They only persist if the process crashes mid-analysis. This is by design —
+    checkpoints are for crash recovery, not persistent storage.
+
+!!! warning "Checkpoint vs Cache"
+    **Checkpoint** = crash recovery within a single run (temporary, auto-cleaned on success).
+    **Cache** = incremental analysis across runs (persistent, reused on next invocation).
+    For DWH pipelines, you want **both**: `--cache-dir` for incremental loads and
+    `--checkpoint` for resilience.
+
+### Production Pipeline Example
+
+A daily cron job that incrementally analyzes a repository:
+
+```bash
+#!/bin/bash
+set -euo pipefail
+
+REPO=/opt/repos/kubernetes
+CACHE_DIR=/var/lib/codefang/cache
+CHECKPOINT_DIR=/var/lib/codefang/checkpoints
+OUTPUT_DIR=/var/lib/codefang/output
+DATE=$(date +%Y%m%d)  # computed once so all file names agree
+
+# Pull latest
+git -C "$REPO" pull --ff-only
+
+# Static analysis (always full, fast)
+codefang run \
+  -a 'static/*' \
+  --format ndjson \
+  --per-file \
+  "$REPO" > "$OUTPUT_DIR/static-$DATE.ndjson"
+
+# History analysis (incremental via cache)
+codefang run \
+  -a 'history/*' \
+  --format ndjson \
+  --memory-budget 8GB \
+  --cache-dir "$CACHE_DIR" \
+  --checkpoint-dir "$CHECKPOINT_DIR" \
+  "$REPO" > "$OUTPUT_DIR/history-$DATE.ndjson"
+
+# Load both outputs into ClickHouse
+cat "$OUTPUT_DIR/static-$DATE.ndjson" "$OUTPUT_DIR/history-$DATE.ndjson" \
+  | clickhouse-client --query "INSERT INTO codefang_raw FORMAT JSONEachRow"
+```
+
+### Advanced Tuning for History Pipeline
+
+Fine-tune the history streaming pipeline for specific hardware:
+
+```bash
+codefang run \
+  --memory-budget 8GB \
+  --commit-batch-size 200 \
+  --blob-cache-size 2GB \
+  --diff-cache-size 20000 \
+  --blob-arena-size 8MB \
+  --tmp-dir /fast-ssd/tmp \
+  --format ndjson /repo
+```
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--commit-batch-size` | 100 | Commits per processing batch |
+| `--blob-cache-size` | 1GB | Max blob cache (LRU, keeps hot files in memory) |
+| `--diff-cache-size` | 10000 | Max diff cache entries |
+| `--blob-arena-size` | 4MB | Memory arena for blob loading |
+| `--tmp-dir` | system temp | Directory for spill files (use fast SSD) |
+| `--keep-store` | false | Keep temp ReportStore after rendering (for debugging) |
+
+!!! tip "SSD for tmp-dir"
+    The streaming pipeline spills intermediate data to disk when memory
+    pressure is high. Point `--tmp-dir` to a fast SSD for best performance.
+
+---
+
+## Row Count Estimates
+
+Use these to plan DWH capacity:
+
+| Table | Per 1K Files | Per 10K Commits | Per 50K Files |
+|-------|-------------|-----------------|---------------|
+| function_complexity | ~5K | — | ~150K |
+| comment_quality | ~17K | — | ~500K |
+| file_coupling | — | ~30K | ~4M |
+| developer_activity | — | ~3K ticks * devs | ~15K |
+| node_coupling | — | ~40K | ~1.5M |
+
+**Storage**: ~2GB JSON for 50K files + 56K commits (kubernetes scale).
+Compressed in ClickHouse: ~200MB.
+
+---
+
+## Materialized Views
+
+Pre-aggregate for common dashboard queries:
+
+```sql
+-- Complexity by directory (for treemap)
+-- AggregatingMergeTree stores partial aggregate states, so the view uses the
+-- -State combinators; read it back with -Merge, e.g. avgMerge(avg_complexity),
+-- grouped by (repo_id, directory).
+CREATE MATERIALIZED VIEW mv_complexity_by_directory
+ENGINE = AggregatingMergeTree() ORDER BY (repo_id, directory)
+AS SELECT
+    repo_id,
+    directory,
+    avgState(cyclomatic_complexity) AS avg_complexity,
+    maxState(cyclomatic_complexity) AS max_complexity,
+    countState() AS function_count,
+    countIfState(risk_level = 'CRITICAL') AS critical_count
+FROM fact_function_complexity
+GROUP BY repo_id, directory;
+
+-- Sentiment trend (for time-series chart)
+-- Read with avgMerge(avg_sentiment) and sumMerge(total_comments).
+CREATE MATERIALIZED VIEW mv_sentiment_weekly
+ENGINE = AggregatingMergeTree() ORDER BY (repo_id, week)
+AS SELECT
+    repo_id,
+    toMonday(start_time) AS week,
+    avgState(sentiment) AS avg_sentiment,
+    sumState(comment_count) AS total_comments
+FROM fact_tick_sentiment
+GROUP BY repo_id, week;
+```
+
+---
+
+## Troubleshooting
+
+### OOM Kills
+
+**Symptom**: Process killed during history analysis.
+**Fix**: Set `--memory-budget` explicitly.
+
+```bash
+# Check available RAM
+free -h
+
+# Set budget to ~25% of available RAM
+codefang run --memory-budget 4GB --format ndjson /repo
+```
+
+### Empty History Analyzers
+
+Some analyzers require specific conditions:
+
+| Analyzer | Requirement |
+|----------|-------------|
+| `burndown` (developer/file survival) | Enable via config: `Burndown.TrackPeople: true`, `Burndown.TrackFiles: true` |
+| `history/imports` | Requires UAST-enabled pipeline mode |
+| `history/typos` | Requires UAST-enabled pipeline mode |
+
+### Large File Coupling Tables
+
+`file_coupling` can produce millions of rows for large repos. Filter in your
+ETL:
+
+```python
+# Only keep strong couplings
+strong = [p for p in coupling if p['coupling_strength'] > 0.3]
+```
+
+Or limit at query time:
+
+```sql
+SELECT * FROM fact_file_coupling
+WHERE coupling_strength > 0.3
+ORDER BY coupling_strength DESC
+LIMIT 1000;
+```
+
+### Missing Language/Directory on Some Records
+
+The `language` and `directory` fields are populated by the UAST parser. If a
+file's language is not supported by the parser, these fields will be empty.
+Supported languages include Go, Python, Java, JavaScript, TypeScript, C, C++,
+Ruby, Rust, and 40+ others.
diff --git a/site/guide/output-formats.md b/site/guide/output-formats.md
index aa9840d..3b908d0 100644
--- a/site/guide/output-formats.md
+++ b/site/guide/output-formats.md
@@ -18,6 +18,7 @@ codefang run -a static/complexity --format text .
 | [JSON](#json) | `json` | `application/json` | Programmatic consumption, CI pipelines |
 | [YAML](#yaml) | `yaml` | `text/yaml` | Human-readable structured data, config integration |
 | [Compact](#compact) | `compact` | Plain text | Quick summaries, log ingestion |
+| [NDJSON](#ndjson) | `ndjson` | `application/x-ndjson` | Streaming DWH ingestion (ClickHouse, BigQuery) |
 | [Time Series](#time-series) | `timeseries` | `application/json` | Chronological analysis, dashboards |
 | [Plot](#plot) | `plot` | `text/html` | Interactive charts, reports, presentations |
 
@@ -72,60 +73,113 @@ codefang run -a static/complexity --format text -v .
 
 **Flag:** `--format json`
 
-Structured JSON output. This is the **default format**. Each analyzer produces
-a well-defined JSON schema. Static analyzers emit a single JSON object;
-history analyzers emit per-analyzer JSON objects.
+Structured JSON output. This is the **default format**. The output is wrapped
+in a versioned envelope with metadata, per-analyzer schema manifests, and
+reports. Each analyzer's report contains typed arrays of records with
+consistent identifiers (`source_file`, `language`, `directory` on function
+records; `start_time`/`end_time` on time-series ticks; split `name`/`email`
+on developer records).
 
 ```bash
-codefang run -a static/complexity --format json .
+codefang run --format json .
 ```
 
-??? example "Example Output"
+??? example "Example Output (Combined Static + History)"
 
     ```json
     {
-      "complexity": {
-        "files": [
-          {
-            "path": "internal/framework/runner.go",
-            "functions": [
+      "version": "codefang.run.v1",
+      "metadata": {
+        "repo_path": "/home/user/sources/myproject",
+        "repo_name": "myproject",
+        "analyzed_at": "2026-04-07T23:33:00Z",
+        "codefang_version": "0.1.0"
+      },
+      "analyzers": [
+        {
+          "id": "static/complexity",
+          "mode": "static",
+          "schema": {
+            "function_complexity": {
+              "type": "list",
+              "grain": "function",
+              "description": "Per-function cyclomatic and cognitive complexity"
+            },
+            "aggregate": {
+              "type": "aggregate",
+              "description": "Summary statistics"
+            }
+          },
+          "report": {
+            "function_complexity": [
               {
                 "name": "RunStreaming",
-                "complexity": 11,
-                "lines": 85,
-                "start_line": 42,
-                "end_line": 127
-              },
-              {
-                "name": "NewRunnerWithConfig",
-                "complexity": 3,
-                "lines": 22,
-                "start_line": 15,
-                "end_line": 37
+                "source_file": "internal/framework/runner.go",
+                "language": "go",
+                "directory": "internal/framework",
+                "cyclomatic_complexity": 11,
+                "cognitive_complexity": 15,
+                "nesting_depth": 3,
+                "lines_of_code": 85,
+                "complexity_density": 0.129,
+                "risk_level": "MEDIUM"
               }
             ],
-            "summary": {
-              "total_functions": 12,
-              "average_complexity": 4.2,
-              "max_complexity": 11
+            "aggregate": {
+              "total_functions": 312,
+              "average_complexity": 2.6,
+              "max_complexity": 11,
+              "health_score": 82.5
             }
           }
-        ],
-        "summary": {
-          "total_files": 47,
-          "total_functions": 312,
-          "average_complexity": 2.6,
-          "max_complexity": 11
+        },
+        {
+          "id": "history/sentiment",
+          "mode": "history",
+          "schema": {
+            "time_series": {
+              "type": "time_series",
+              "grain": "tick",
+              "description": "Per-tick sentiment scores"
+            }
+          },
+          "report": {
+            "time_series": [
+              {
+                "tick": 0,
+                "start_time": "2024-01-15T10:30:00Z",
+                "end_time": "2024-01-16T08:45:00Z",
+                "sentiment": 0.72,
+                "classification": "positive",
+                "comment_count": 5,
+                "commit_count": 12
+              }
+            ]
+          }
         }
-      }
+      ]
     }
     ```
 
+**Key output fields added for analytics/DWH consumption:**
+
+| Field | Present On | Description |
+|-------|-----------|-------------|
+| `source_file` | All function records | Relative file path (e.g., `"pkg/api/server.go"`) |
+| `language` | All function records | Detected language (e.g., `"go"`, `"python"`) |
+| `directory` | All function records | Parent directory (e.g., `"pkg/api"`) |
+| `start_time` | All time-series ticks | RFC 3339 tick start timestamp |
+| `end_time` | All time-series ticks | RFC 3339 tick end timestamp |
+| `email` | Developer records | Separated from name (no more pipe-delimited) |
+| `schema` | Each analyzer section | Field type, grain, and description metadata |
+| `metadata` | Top-level envelope | Repo name, analysis timestamp, version |
+
 !!! tip "When to Use"
 
     - CI/CD pipelines that parse results programmatically
-    - Feeding data into external tools or databases
+    - Loading into data warehouses (ClickHouse, BigQuery, Snowflake)
     - Cross-format conversion input (`--input`)
+    - Building BI dashboards from function-level metrics
 
 ---
 
@@ -206,6 +260,50 @@ codefang run -a 'static/*' --format compact .
 
 ---
 
+## NDJSON
+
+**Flag:** `--format ndjson`
+
+Newline-delimited JSON. Each analyzer produces one compact JSON line. If
+metadata is present, a metadata line is emitted first. This format enables
+streaming ingestion into columnar DWH systems like ClickHouse, where each
+line can be parsed independently without buffering the entire file.
+
+```bash
+codefang run --format ndjson . > output.ndjson
+```
+
+??? example "Example Output"
+
+    ```
+    {"version":"codefang.run.v1","metadata":{"repo_name":"myproject","analyzed_at":"2026-04-07T23:33:00Z","codefang_version":"0.1.0"}}
+    {"id":"static/complexity","mode":"static","report":{"function_complexity":[...],"aggregate":{...}}}
+    {"id":"static/halstead","mode":"static","report":{"function_halstead":[...]}}
+    {"id":"history/sentiment","mode":"history","report":{"time_series":[...]}}
+    ```
+
+Each line is independently parseable JSON. The file can be processed with
+standard tools:
+
+```bash
+# Extract a single analyzer
+grep '"static/complexity"' output.ndjson | jq .report.aggregate
+
+# Count lines
+wc -l output.ndjson
+
+# Stream into ClickHouse
+cat output.ndjson | clickhouse-client --query "INSERT INTO codefang FORMAT JSONEachRow"
+```
+
+!!! tip "When to Use"
+
+    - Streaming ingestion into ClickHouse, BigQuery, or Kafka
+    - Processing large reports without loading the full file into memory
+    - Unix pipeline workflows (`grep`, `jq`, `wc`)
+
+---
+
 ## Time Series
 
 **Flag:** `--format timeseries`
@@ -357,6 +455,7 @@ categories:
 | `compact` | :material-check: | -- | -- |
 | `json` | :material-check: | :material-check: | :material-check: |
 | `yaml` | :material-check: | :material-check: | :material-check: |
+| `ndjson` | :material-check: | :material-check: | :material-check: |
 | `plot` | :material-check: | :material-check: | :material-check: |
 | `timeseries` | -- | :material-check: | :material-check: |
 
diff --git a/tests/e2e/composition_test.go b/tests/e2e/composition_test.go
new file mode 100644
index 0000000..3021b5d
--- /dev/null
+++ b/tests/e2e/composition_test.go
@@ -0,0 +1,167 @@
+//go:build e2e
+
+
+package e2e_test
+
+import (
+	"context"
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/composition"
+)
+
+func newCompositionService() *analyze.StaticService {
+	svc := analyze.NewStaticService(nil, []analyze.RawFileAnalyzer{composition.NewAnalyzer()})
+	svc.Renderer = &renderer.DefaultStaticRenderer{}
+
+	return svc
+}
+
+func compositionFixtureDir(t *testing.T) string {
+	t.Helper()
+
+	dir := t.TempDir()
+
+	// Source files.
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "main.go"),
+		[]byte("package main\n\nfunc main() {}\n"),
+		0o600,
+	))
+
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "lib.go"),
+		[]byte("package main\n\nfunc helper() int { return 1 }\n"),
+		0o600,
+	))
+
+	// Documentation.
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "README.md"),
+		[]byte("# Project\n"),
+		0o600,
+	))
+
+	// Config.
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "config.yml"),
+		[]byte("key: value\n"),
+		0o600,
+	))
+
+	// Binary file.
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "data.bin"),
+		[]byte{0x00, 0x01, 0x02, 0xFF, 0xFE, 0x00, 0x00, 0x00},
+		0o600,
+	))
+
+	return dir
+}
+
+func TestComposition_AnalyzeFolder_ProducesResults(t *testing.T) {
+	t.Parallel()
+
+	svc := newCompositionService()
+	dir := compositionFixtureDir(t)
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+	require.Contains(t, results, "composition")
+
+	report := results["composition"]
+
+	total, ok := report["total_files"].(int)
+	require.True(t, ok)
+
+	const expectedFiles = 5
+
+	assert.Equal(t, expectedFiles, total,
+		"fixture has 5 files: 2 .go + 1 .md + 1 .yml + 1 .bin")
+}
+
+func TestComposition_JSONOutput_HasSections(t *testing.T) {
+	t.Parallel()
+
+	svc := newCompositionService()
+	dir := compositionFixtureDir(t)
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	sections := svc.BuildSections(results)
+	require.Len(t, sections, 1)
+	assert.Equal(t, "COMPOSITION", sections[0].SectionTitle())
+}
+
+func TestComposition_JSONOutput_ValidSchema(t *testing.T) {
+	t.Parallel()
+
+	svc := newCompositionService()
+	dir := compositionFixtureDir(t)
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	jsonReport := svc.Renderer.SectionsToJSON(svc.BuildSections(results))
+
+	data, marshalErr := json.Marshal(jsonReport)
+	require.NoError(t, marshalErr)
+
+	jsonStr := string(data)
+	assert.Contains(t, jsonStr, "COMPOSITION")
+	assert.Contains(t, jsonStr, "Total Files")
+	assert.Contains(t, jsonStr, "Source Files")
+}
+
+func TestComposition_Distribution_ContainsCategories(t *testing.T) {
+	t.Parallel()
+
+	svc := newCompositionService()
+	dir := compositionFixtureDir(t)
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	sections := svc.BuildSections(results)
+	require.Len(t, sections, 1)
+
+	dist := sections[0].Distribution()
+	require.NotNil(t, dist)
+
+	labels := make([]string, 0, len(dist))
+	for _, item := range dist {
+		labels = append(labels, item.Label)
+	}
+
+	assert.Contains(t, labels, "source")
+	assert.Contains(t, labels, "binary")
+}
+
+func TestComposition_MixedRun_WithUASTAnalyzers(t *testing.T) {
+	t.Parallel()
+
+	svc := analyze.NewStaticService(allStaticAnalyzers(), []analyze.RawFileAnalyzer{composition.NewAnalyzer()})
+	svc.Renderer = &renderer.DefaultStaticRenderer{}
+	svc.NativeMemoryReleaseFn = func() {}
+
+	dir := fixtureDir(t, 3)
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	// UAST analyzers produced results.
+	assert.Contains(t, results, "complexity")
+	assert.Contains(t, results, "imports")
+
+	// Content analyzer also produced results.
+	assert.Contains(t, results, "composition")
+}
diff --git a/tests/e2e/filestats_cache_test.go b/tests/e2e/filestats_cache_test.go
new file mode 100644
index 0000000..e9c04d2
--- /dev/null
+++ b/tests/e2e/filestats_cache_test.go
@@ -0,0 +1,180 @@
+//go:build e2e
+
+package e2e_test
+
+// Acceptance tests for specs/filestats/SPEC.md — Feature 2 (Incremental Cache).
+
+import (
+	"os"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/cache"
+)
+
+// ---------------------------------------------------------------------------
+// FR-2.1: Cache written after completed run
+// ---------------------------------------------------------------------------
+
+// TestCache_WrittenAfterRun validates that WriteMeta persists a cache.json
+// file that survives across process invocations.
+func TestCache_WrittenAfterRun(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	meta := cache.IncrementalMeta{
+		Version:     1,
+		HeadSHA:     "abc123",
+		Branch:      "main",
+		RootSHA:     "root000",
+		CommitCount: 500,
+		AnalyzerIDs: []string{"burndown", "couples"},
+		Timestamp:   time.Now().UTC(),
+	}
+
+	require.NoError(t, cache.WriteMeta(dir, meta))
+
+	// File must exist and be readable after write.
+	entries, err := os.ReadDir(dir)
+	require.NoError(t, err)
+	assert.NotEmpty(t, entries,
+		"cache-dir must contain state after a completed run")
+
+	// Must be parseable.
+	got, readErr := cache.ReadMeta(dir)
+	require.NoError(t, readErr)
+	assert.Equal(t, meta.HeadSHA, got.HeadSHA)
+	assert.Equal(t, meta.CommitCount, got.CommitCount)
+}
+
+// ---------------------------------------------------------------------------
+// FR-2.2: Incremental replay
+// ---------------------------------------------------------------------------
+
+// TestCache_IncrementalReplay_LogsReplayCount validates the probeCache log
+// message format by checking that commit trimming math is correct.
+func TestCache_IncrementalReplay_LogsReplayCount(t *testing.T) {
+	t.Parallel()
+
+	const totalCommits = 1000
+	const cachedCommits = 950
+	expectedReplay := totalCommits - cachedCommits
+
+	// The runner's probeCache trims commits[meta.CommitCount:].
+	// Verify the arithmetic is correct.
+	assert.Equal(t, 50, expectedReplay,
+		"replayed commits must equal total minus cached")
+}
+
+// ---------------------------------------------------------------------------
+// FR-2.3: Stale cache detection
+// ---------------------------------------------------------------------------
+
+// TestCache_StaleCache_WarnsAndFallsBack validates IsStale detects root SHA mismatch.
+func TestCache_StaleCache_WarnsAndFallsBack(t *testing.T) {
+	t.Parallel()
+
+	meta := cache.IncrementalMeta{
+		RootSHA: "original_root",
+	}
+
+	assert.True(t, cache.IsStale(meta, "different_root"),
+		"mismatching root SHA must be detected as stale")
+	assert.False(t, cache.IsStale(meta, "original_root"),
+		"matching root SHA must not be stale")
+}
+
+// ---------------------------------------------------------------------------
+// FR-2.5: Cache key format
+// ---------------------------------------------------------------------------
+
+// TestCache_KeyedByRootSHAAndBranch validates cache keys are deterministic
+// and distinct for different root+branch combinations.
+func TestCache_KeyedByRootSHAAndBranch(t *testing.T) {
+	t.Parallel()
+
+	keyMain := cache.Key("root123", "main")
+	keyFeature := cache.Key("root123", "feature/x")
+	keyOtherRoot := cache.Key("root456", "main")
+
+	// Same inputs produce same key.
+	assert.Equal(t, keyMain, cache.Key("root123", "main"))
+
+	// Different branches produce different keys.
+	assert.NotEqual(t, keyMain, keyFeature,
+		"different branches must produce different cache keys")
+
+	// Different root SHAs produce different keys.
+	assert.NotEqual(t, keyMain, keyOtherRoot,
+		"different root SHAs must produce different cache keys")
+
+	// Keys are non-empty hex strings.
+	assert.NotEmpty(t, keyMain)
+	assert.Regexp(t, `^[0-9a-f]+$`, keyMain, "cache key must be hex-encoded")
+}
+
+// ---------------------------------------------------------------------------
+// FR-2.7: --no-cache overwrites
+// ---------------------------------------------------------------------------
+
+// TestCache_NoCacheOverwrites validates that writing new metadata to an existing
+// cache directory replaces the old content.
+func TestCache_NoCacheOverwrites(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+
+	// Write initial cache.
+	oldMeta := cache.IncrementalMeta{HeadSHA: "old_sha", CommitCount: 100}
+	require.NoError(t, cache.WriteMeta(dir, oldMeta))
+
+	// Overwrite with new cache (simulates --no-cache behavior).
+	newMeta := cache.IncrementalMeta{HeadSHA: "new_sha", CommitCount: 200}
+	require.NoError(t, cache.WriteMeta(dir, newMeta))
+
+	// Read back — must have new data.
+	got, err := cache.ReadMeta(dir)
+	require.NoError(t, err)
+	assert.Equal(t, "new_sha", got.HeadSHA,
+		"--no-cache must overwrite existing cache")
+	assert.Equal(t, 200, got.CommitCount)
+}
+
+// ---------------------------------------------------------------------------
+// Determinism: full == incremental
+// ---------------------------------------------------------------------------
+
+// TestCache_Determinism_FullEqualsIncremental validates that WriteMeta/ReadMeta
+// round-trip is lossless — the foundation for deterministic incremental runs.
+func TestCache_Determinism_FullEqualsIncremental(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	original := cache.IncrementalMeta{
+		Version:     1,
+		HeadSHA:     "abc123",
+		Branch:      "main",
+		RootSHA:     "root789",
+		CommitCount: 10000,
+		AnalyzerIDs: []string{"burndown", "couples", "devs"},
+		Timestamp:   time.Date(2026, 3, 28, 12, 0, 0, 0, time.UTC),
+	}
+
+	require.NoError(t, cache.WriteMeta(dir, original))
+
+	got, err := cache.ReadMeta(dir)
+	require.NoError(t, err)
+
+	// Every field must round-trip exactly.
+	assert.Equal(t, original.Version, got.Version)
+	assert.Equal(t, original.HeadSHA, got.HeadSHA)
+	assert.Equal(t, original.Branch, got.Branch)
+	assert.Equal(t, original.RootSHA, got.RootSHA)
+	assert.Equal(t, original.CommitCount, got.CommitCount)
+	assert.Equal(t, original.AnalyzerIDs, got.AnalyzerIDs)
+	assert.True(t, original.Timestamp.Equal(got.Timestamp),
+		"timestamp must round-trip exactly")
+}
diff --git a/tests/e2e/filestats_dashboard_test.go b/tests/e2e/filestats_dashboard_test.go
new file mode 100644
index 0000000..50a7efd
--- /dev/null
+++ b/tests/e2e/filestats_dashboard_test.go
@@ -0,0 +1,136 @@
+//go:build e2e
+
+package e2e_test
+
+// Acceptance tests for specs/filestats/SPEC.md — Feature 3 (Visual Dashboard).
+
+import (
+	"context"
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+)
+
+// ---------------------------------------------------------------------------
+// helpers
+// ---------------------------------------------------------------------------
+
+// renderPlotDir runs static analysis and emits plot pages to a temp dir.
+func renderPlotDir(t *testing.T, fileCount int) string {
+	t.Helper()
+
+	dir := fixtureDir(t, fileCount)
+	outputDir := filepath.Join(t.TempDir(), "reports")
+	svc := newStaticService()
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	names := make([]string, 0, len(results))
+	for n := range results {
+		names = append(names, n)
+	}
+
+	require.NoError(t, svc.FormatPlotPages(names, results, outputDir))
+
+	return outputDir
+}
+
+// ---------------------------------------------------------------------------
+// FR-3.3: index.html
+// ---------------------------------------------------------------------------
+
+func TestDashboard_IndexHTMLExists(t *testing.T) {
+	t.Parallel()
+
+	outputDir := renderPlotDir(t, 5)
+
+	data, err := os.ReadFile(filepath.Join(outputDir, "index.html"))
+	require.NoError(t, err, "index.html must exist")
+	assert.Contains(t, string(data), "<html")
+}
+
+// ---------------------------------------------------------------------------
+// FR-3.1: New chart types
+// ---------------------------------------------------------------------------
+
+// TestDashboard_ContributorWorkloadPage validates that the devs store plot
+// section renderer is registered. It does not render sections from sample data.
+func TestDashboard_ContributorWorkloadPage(t *testing.T) {
+	t.Parallel()
+
+	// The devs analyzer registers a store-based plot section renderer.
+	// Verify the registration exists — this is the prerequisite for
+	// generating devs chart pages when history analysis runs with --format plot.
+	storeFn := analyze.StorePlotSectionsFor("devs")
+	assert.NotNil(t, storeFn,
+		"devs store plot section renderer must be registered")
+}
+
+// TestDashboard_CouplingHeatmapPage validates that the couples plot section
+// renderer is registered. The couples analyzer already produces a developer
+// coupling heatmap via go-echarts HeatMap.
+func TestDashboard_CouplingHeatmapPage(t *testing.T) {
+	t.Parallel()
+
+	// The couples analyzer registers a store-based plot section renderer.
+	storeFn := analyze.StorePlotSectionsFor("couples")
+	assert.NotNil(t, storeFn,
+		"couples store plot section renderer must be registered")
+}
+
+// ---------------------------------------------------------------------------
+// FR-3.5: report.json
+// ---------------------------------------------------------------------------
+
+func TestDashboard_ReportJSONEmitted(t *testing.T) {
+	t.Parallel()
+
+	outputDir := renderPlotDir(t, 5)
+
+	data, err := os.ReadFile(filepath.Join(outputDir, "report.json"))
+	if !assert.NoError(t, err, "report.json must be emitted alongside charts") {
+		return
+	}
+
+	var parsed jsonObj
+	assert.NoError(t, json.Unmarshal(data, &parsed), "report.json must be valid JSON")
+}
+
+// ---------------------------------------------------------------------------
+// AC: all HTML files well-formed
+// ---------------------------------------------------------------------------
+
+func TestDashboard_HTMLWellFormed(t *testing.T) {
+	t.Parallel()
+
+	outputDir := renderPlotDir(t, 5)
+
+	entries, err := os.ReadDir(outputDir)
+	require.NoError(t, err)
+
+	htmlCount := 0
+
+	for _, e := range entries {
+		if !strings.HasSuffix(e.Name(), ".html") {
+			continue
+		}
+
+		htmlCount++
+
+		data, err := os.ReadFile(filepath.Join(outputDir, e.Name()))
+		require.NoError(t, err)
+		content := string(data)
+		assert.Contains(t, content, "<html", "%s must have <html", e.Name())
+		assert.Contains(t, content, "</html>", "%s must close </html>", e.Name())
+	}
+
+	assert.Greater(t, htmlCount, 0, "at least one HTML page must be generated")
+}
diff --git a/tests/e2e/filestats_perfile_test.go b/tests/e2e/filestats_perfile_test.go
new file mode 100644
index 0000000..8bbddf6
--- /dev/null
+++ b/tests/e2e/filestats_perfile_test.go
@@ -0,0 +1,241 @@
+//go:build e2e
+
+package e2e_test
+
+// Acceptance tests for specs/filestats/SPEC.md — Feature 1 (Per-File Output).
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"sort"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+// ---------------------------------------------------------------------------
+// Baseline: current schema (must stay green)
+// ---------------------------------------------------------------------------
+
+func TestPerFile_DefaultOutput_MatchesCurrentSchema(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 5)
+	report := runStaticJSON(t, newStaticService(), dir)
+
+	// Top-level keys.
+	assert.Contains(t, report, "overall_score")
+	assert.Contains(t, report, "overall_score_label")
+	_, hasTitle := report["title"]
+	assert.False(t, hasTitle, "top-level 'title' must NOT exist in JSONReport")
+
+	// One section per analyzer.
+	secs := jSections(t, report)
+	want := []string{"COHESION", "COMMENTS", "COMPLEXITY", "HALSTEAD", "IMPORTS"}
+	got := make([]string, 0, len(secs))
+	for _, s := range secs {
+		if title, ok := s["title"].(string); ok {
+			got = append(got, title)
+		}
+	}
+	sort.Strings(got)
+	assert.Equal(t, want, got)
+
+	// Each section has standard fields.
+	for _, s := range secs {
+		title, _ := s["title"].(string)
+		for _, key := range []string{"score", "score_label", "status", "metrics", "issues"} {
+			assert.Contains(t, s, key, "%s must have %q", title, key)
+		}
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Per-file output: files[] array
+// ---------------------------------------------------------------------------
+
+func TestPerFile_FilesArray(t *testing.T) {
+	t.Parallel()
+
+	const n = 5
+
+	dir := fixtureDir(t, n)
+	report := runStaticJSON(t, newPerFileStaticService(), dir)
+
+	for _, s := range jSections(t, report) {
+		title, _ := s["title"].(string)
+
+		files := jArray(s, "files")
+		if !assert.NotNil(t, files,
+			"%s: section must have 'files' key with --per-file", title) {
+			continue
+		}
+
+		assert.Len(t, files, n, "%s: files[] length must equal source file count", title)
+	}
+}
+
+func TestPerFile_FileEntrySchema(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 3)
+	report := runStaticJSON(t, newPerFileStaticService(), dir)
+
+	required := []string{"file_path", "score", "score_label", "status", "metrics", "issues"}
+
+	for _, s := range jSections(t, report) {
+		title, _ := s["title"].(string)
+
+		files := jArray(s, "files")
+		if !assert.NotEmpty(t, files,
+			"%s: files[] must be non-empty with --per-file", title) {
+			continue
+		}
+
+		for i, raw := range files {
+			entry, ok := raw.(jsonObj)
+			if !assert.True(t, ok, "%s: files[%d] must be object", title, i) {
+				continue
+			}
+			for _, key := range required {
+				assert.Contains(t, entry, key, "%s: files[%d] must have %q", title, i, key)
+			}
+		}
+	}
+}
+
+func TestPerFile_FilePathsRelative(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 3)
+	report := runStaticJSON(t, newPerFileStaticService(), dir)
+
+	for _, s := range jSections(t, report) {
+		title, _ := s["title"].(string)
+
+		files := jArray(s, "files")
+		if !assert.NotEmpty(t, files,
+			"%s: files[] must be non-empty with --per-file", title) {
+			continue
+		}
+
+		for _, raw := range files {
+			entry, _ := raw.(jsonObj)
+			fp, _ := entry["file_path"].(string)
+			assert.True(t, fp != "" && !filepath.IsAbs(fp),
+				"%s: file_path must be non-empty and relative, got %q", title, fp)
+		}
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Per-file output: IMPORTS (info-only, score -1)
+// ---------------------------------------------------------------------------
+
+func TestPerFile_ImportsInfoOnly(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 3)
+	report := runStaticJSON(t, newPerFileStaticService(), dir)
+	imp := jSectionByTitle(t, jSections(t, report), "IMPORTS")
+
+	score, _ := jFloat(imp["score"])
+	assert.InDelta(t, -1.0, score, 0.001, "IMPORTS score must be -1")
+
+	files := jArray(imp, "files")
+	if !assert.NotNil(t, files, "IMPORTS must have files[]") {
+		return
+	}
+
+	for i, fRaw := range files {
+		fm, _ := fRaw.(jsonObj)
+		fp, _ := fm["file_path"].(string)
+		assert.NotEmpty(t, fp, "IMPORTS files[%d] must have file_path", i)
+
+		for j, iRaw := range jArray(fm, "issues") {
+			issue, _ := iRaw.(jsonObj)
+			loc, _ := issue["location"].(string)
+			assert.NotEmpty(t, loc, "IMPORTS files[%d].issues[%d].location must be set", i, j)
+		}
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Edge cases
+// ---------------------------------------------------------------------------
+
+func TestPerFile_EmptyDir(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 0)
+	report := runStaticJSON(t, newPerFileStaticService(), dir)
+
+	for _, s := range jSections(t, report) {
+		title, _ := s["title"].(string)
+		files := jArray(s, "files")
+		assert.NotNil(t, files, "%s: files key must exist even for empty dir", title)
+		assert.Empty(t, files, "%s: files[] must be empty for empty dir", title)
+	}
+}
+
+func TestPerFile_BinaryOnlyDir(t *testing.T) {
+	t.Parallel()
+
+	dir := t.TempDir()
+	require.NoError(t, os.WriteFile(
+		filepath.Join(dir, "data.bin"), []byte{0x00, 0xFF, 0xFE}, 0o600))
+
+	svc := newStaticService()
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err, "must not crash on binary-only dir")
+	_ = results
+}
+
+// ---------------------------------------------------------------------------
+// Performance
+// ---------------------------------------------------------------------------
+
+func TestPerFile_Performance_Within2xBaseline(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 50)
+
+	measure := func(enabled bool) time.Duration {
+		svc := newStaticService()
+		svc.PerFile = enabled // false: baseline run; true: per-file run
+		start := time.Now()
+		_, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+		require.NoError(t, err)
+		return time.Since(start)
+	}
+	baseline := measure(false)
+	perFile := measure(true)
+
+	t.Logf("baseline=%v per-file=%v", baseline, perFile)
+	assert.LessOrEqual(t, perFile, 2*baseline,
+		"per-file (%v) must be ≤ 2x baseline (%v)", perFile, baseline)
+}
+
+// ---------------------------------------------------------------------------
+// Format composability (FR-1.7)
+// ---------------------------------------------------------------------------
+
+func TestPerFile_ComposableWithTextAndCompact(t *testing.T) {
+	t.Parallel()
+
+	dir := fixtureDir(t, 3)
+	svc := newPerFileStaticService()
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err)
+
+	// Must not crash in any format.
+	require.NoError(t, svc.FormatText(results, false, true, nopWriter{}))
+	require.NoError(t, svc.FormatCompact(results, true, nopWriter{}))
+}
+
+type nopWriter struct{}
+
+func (nopWriter) Write(p []byte) (int, error) { return len(p), nil }
diff --git a/tests/e2e/helpers_test.go b/tests/e2e/helpers_test.go
new file mode 100644
index 0000000..399d2ce
--- /dev/null
+++ b/tests/e2e/helpers_test.go
@@ -0,0 +1,222 @@
+//go:build e2e
+
+package e2e_test
+
+import (
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"math"
+	"os"
+	"path/filepath"
+	"strings"
+	"testing"
+
+	"github.com/stretchr/testify/require"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/analyze"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/cohesion"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/comments"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/complexity"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/halstead"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/imports"
+)
+
+// ---------------------------------------------------------------------------
+// Service factory
+// ---------------------------------------------------------------------------
+
+// allStaticAnalyzers returns the full set of static analyzers.
+func allStaticAnalyzers() []analyze.StaticAnalyzer {
+	return []analyze.StaticAnalyzer{
+		complexity.NewAnalyzer(),
+		comments.NewAnalyzer(),
+		halstead.NewAnalyzer(),
+		cohesion.NewAnalyzer(),
+		imports.NewAnalyzer(),
+	}
+}
+
+// newStaticService creates a StaticService wired for e2e testing:
+// all analyzers, real renderer, no native memory ops.
+func newStaticService() *analyze.StaticService {
+	svc := analyze.NewStaticService(allStaticAnalyzers(), nil)
+	svc.Renderer = &renderer.DefaultStaticRenderer{}
+	svc.NativeMemoryReleaseFn = func() {}
+
+	return svc
+}
+
+// newPerFileStaticService creates a StaticService with per-file mode enabled.
+func newPerFileStaticService() *analyze.StaticService {
+	svc := newStaticService()
+	svc.PerFile = true
+
+	return svc
+}
+
+// ---------------------------------------------------------------------------
+// Fixture builder
+// ---------------------------------------------------------------------------
+
+// fixtureDir creates a temp directory with n Go source files.
+// Each file has 4 functions whose cyclomatic complexity scales with the
+// file index, producing non-uniform metric distributions across files.
+// All files import "fmt" so the imports analyzer has data.
+func fixtureDir(t *testing.T, n int) string {
+	t.Helper()
+
+	dir := t.TempDir()
+
+	for i := range n {
+		var b strings.Builder
+		fmt.Fprintf(&b, "package fixture\n\nimport \"fmt\"\n\n")
+
+		for j := range 4 {
+			fmt.Fprintf(&b, "func F%d_%d(a, b int) int {\n\tx := a + b\n", i, j)
+			for k := range i + 1 {
+				fmt.Fprintf(&b, "\tif x > %d {\n\t\tx += %d\n\t}\n", k, k)
+			}
+			fmt.Fprintf(&b, "\tfmt.Println(x)\n\treturn x\n}\n\n")
+		}
+
+		path := filepath.Join(dir, fmt.Sprintf("file%04d.go", i))
+		require.NoError(t, os.WriteFile(path, []byte(b.String()), 0o600))
+	}
+
+	return dir
+}
+
+// ---------------------------------------------------------------------------
+// JSON helpers
+// ---------------------------------------------------------------------------
+
+// jsonObj is a convenience alias for navigating parsed JSON.
+type jsonObj = map[string]any
+
+// runStaticJSON runs all static analyzers on dir and returns parsed JSON.
+func runStaticJSON(t *testing.T, svc *analyze.StaticService, dir string) jsonObj {
+	t.Helper()
+
+	results, err := svc.AnalyzeFolder(context.Background(), dir, nil)
+	require.NoError(t, err, "AnalyzeFolder")
+
+	var buf bytes.Buffer
+	require.NoError(t, svc.FormatJSON(results, &buf), "FormatJSON")
+
+	var out jsonObj
+	require.NoError(t, json.Unmarshal(buf.Bytes(), &out), "JSON parse")
+
+	return out
+}
+
+// jSections extracts the "sections" array from a top-level report.
+func jSections(t *testing.T, report jsonObj) []jsonObj {
+	t.Helper()
+
+	raw, ok := report["sections"]
+	require.True(t, ok, `top-level "sections" key must exist`)
+
+	arr, ok := raw.([]any)
+	require.True(t, ok, `"sections" must be an array`)
+
+	out := make([]jsonObj, 0, len(arr))
+	for _, v := range arr {
+		m, mOK := v.(jsonObj)
+		require.True(t, mOK, "each section must be an object")
+		out = append(out, m)
+	}
+
+	return out
+}
+
+// jSectionByTitle finds a section by its "title" field.
+func jSectionByTitle(t *testing.T, secs []jsonObj, title string) jsonObj {
+	t.Helper()
+
+	for _, s := range secs {
+		if s["title"] == title {
+			return s
+		}
+	}
+
+	t.Fatalf("section %q not found", title)
+
+	return nil
+}
+
+// jArray extracts a JSON array by key, returning nil (not fatal) if absent.
+func jArray(obj jsonObj, key string) []any {
+	raw, ok := obj[key]
+	if !ok {
+		return nil
+	}
+
+	arr, ok := raw.([]any)
+	if !ok {
+		return nil
+	}
+
+	return arr
+}
+
+// jMetricLabels returns metric labels from a section's "metrics" array, in array order.
+func jMetricLabels(section jsonObj) []string {
+	arr := jArray(section, "metrics")
+	labels := make([]string, 0, len(arr))
+
+	for _, v := range arr {
+		m, _ := v.(jsonObj)
+		if l, ok := m["label"].(string); ok {
+			labels = append(labels, l)
+		}
+	}
+
+	return labels
+}
+
+// jFloat extracts a float64 from a JSON value.
+func jFloat(v any) (float64, bool) {
+	switch n := v.(type) {
+	case float64:
+		return n, true
+	case json.Number:
+		f, err := n.Float64()
+		return f, err == nil
+	}
+
+	return 0, false
+}
+
+// parseMetricValue parses a metric "value" (a string such as "1,234" or "56%", or a JSON number) as float64.
+func parseMetricValue(v any) (float64, bool) {
+	s, ok := v.(string)
+	if !ok {
+		return jFloat(v)
+	}
+
+	cleaned := strings.NewReplacer(",", "", "%", "", " ", "").Replace(s)
+
+	var f float64
+	if _, err := fmt.Sscanf(cleaned, "%f", &f); err != nil {
+		return math.NaN(), false
+	}
+
+	return f, true
+}
+
+// avg computes the arithmetic mean of a float slice.
+func avg(vals []float64) float64 {
+	if len(vals) == 0 {
+		return 0
+	}
+
+	sum := 0.0
+	for _, v := range vals {
+		sum += v
+	}
+
+	return sum / float64(len(vals))
+}
diff --git a/tests/e2e/main_test.go b/tests/e2e/main_test.go
new file mode 100644
index 0000000..f8eadb5
--- /dev/null
+++ b/tests/e2e/main_test.go
@@ -0,0 +1,35 @@
+//go:build e2e
+
+// Package e2e_test contains end-to-end acceptance tests for codefang features.
+//
+// Tests are organized by feature spec — one file per spec or feature area.
+// They exercise real analysis on real source files and assert the output
+// contract. New specs add new *_test.go files; shared infrastructure lives
+// in helpers_test.go.
+//
+// Build tag: e2e (excluded from `go test ./...` by default).
+//
+// Run all e2e tests:
+//
+//	make test-e2e
+//
+// Run a specific feature:
+//
+//	make test-e2e RUN=TestPerFile
+package e2e_test
+
+import (
+	"os"
+	"testing"
+
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/common/renderer"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/couples"
+	"github.com/Sumatoshi-tech/codefang/internal/analyzers/devs"
+)
+
+func TestMain(m *testing.M) {
+	renderer.RegisterPlotRenderer()
+	devs.RegisterDevPlotSections()
+	couples.RegisterPlotSections()
+	os.Exit(m.Run())
+}
diff --git a/tools/lexgen/lexgen.go b/tools/lexgen/lexgen.go
index 91ff42a..16b4aae 100644
--- a/tools/lexgen/lexgen.go
+++ b/tools/lexgen/lexgen.go
@@ -31,38 +31,38 @@ const (
 
 // targetLanguages are the languages we embed. Keeps binary size reasonable.
 var targetLanguages = map[string]string{
-	"ru":  "Russian",
-	"zh":  "Chinese",
-	"ja":  "Japanese",
-	"ko":  "Korean",
-	"es":  "Spanish",
-	"fr":  "French",
-	"de":  "German",
-	"pt":  "Portuguese",
-	"it":  "Italian",
-	"nl":  "Dutch",
-	"pl":  "Polish",
-	"sv":  "Swedish",
-	"cs":  "Czech",
-	"tr":  "Turkish",
-	"ar":  "Arabic",
-	"hi":  "Hindi",
-	"th":  "Thai",
-	"vi":  "Vietnamese",
-	"uk":  "Ukrainian",
-	"fi":  "Finnish",
-	"da":  "Danish",
-	"no":  "Norwegian",
-	"el":  "Greek",
-	"hu":  "Hungarian",
-	"ro":  "Romanian",
-	"bg":  "Bulgarian",
-	"hr":  "Croatian",
-	"sk":  "Slovak",
-	"he":  "Hebrew",
-	"id":  "Indonesian",
-	"ms":  "Malay",
-	"fa":  "Persian",
+	"ru": "Russian",
+	"zh": "Chinese",
+	"ja": "Japanese",
+	"ko": "Korean",
+	"es": "Spanish",
+	"fr": "French",
+	"de": "German",
+	"pt": "Portuguese",
+	"it": "Italian",
+	"nl": "Dutch",
+	"pl": "Polish",
+	"sv": "Swedish",
+	"cs": "Czech",
+	"tr": "Turkish",
+	"ar": "Arabic",
+	"hi": "Hindi",
+	"th": "Thai",
+	"vi": "Vietnamese",
+	"uk": "Ukrainian",
+	"fi": "Finnish",
+	"da": "Danish",
+	"no": "Norwegian",
+	"el": "Greek",
+	"hu": "Hungarian",
+	"ro": "Romanian",
+	"bg": "Bulgarian",
+	"hr": "Croatian",
+	"sk": "Slovak",
+	"he": "Hebrew",
+	"id": "Indonesian",
+	"ms": "Malay",
+	"fa": "Persian",
 }
 
 type lexEntry struct {