Remove confirmed-dead scripts, defaults, and config keys #1208

Open
trvrb wants to merge 2 commits into master from cleanup-dead-code
trvrb (Member) commented Jun 27, 2026

First in a series of small, independently-reviewable PRs cleaning up pandemic-era cruft (see the cleanup plan; scope: dead code + proximity removal + docs/schemes).

Motivation

The repo accumulated scripts, default data files, and config keys that nothing references anymore. This PR removes the verified-dead ones. The three live contracts — weekly OPEN build, occasional GISAID builds, and external users running their own builds — are unaffected.

What's removed

  • Orphan scripts (0 refs in workflow/, Snakefile, nextstrain_profiles/, docs/, or tests/): scripts/add_labels.py, scripts/generate-scientific-credits.py, and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py, parse_mutational_fitness_tsv_into_distance_map.py).
  • Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used by the distances rule), defaults/clade_hierarchy.tsv, defaults/clades_who.tsv.
  • Dead config key files.outgroup in defaults/parameters.yaml — pointed at a file that doesn't exist in the repo and was read nowhere; its config-reference entry (already documented "No longer used") is removed too.
  • Deprecated my_profiles/ directory (only a deprecation README) and its now-orphan .gitignore exception.
  • Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*, data/example_multiple_inputs.tar.xz. The CI-used example_metadata.tsv / example_sequences.fasta.gz are kept.

Verification

  • Every removed path was confirmed to have zero references across workflow/, Snakefile, nextstrain_profiles/, docs/, and tests/ (accounting for config-key indirection, which hides references behind config["files"][...]).
  • snakemake --profile nextstrain_profiles/nextstrain-ci -n builds an unchanged 37-job DAG with no errors.
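
The zero-reference check described above could be approximated with a short script. This is only a sketch, not the procedure actually used for this PR: the function names `iter_files` and `count_references` are hypothetical, and it does plain substring matching, so config-key indirection has to be covered by passing the key name (for example `outgroup`) as an additional search string alongside the file name.

```python
from pathlib import Path

# Search roots named in the PR description; a root may be a directory or a single file.
SEARCH_ROOTS = ["workflow", "Snakefile", "nextstrain_profiles", "docs", "tests"]


def iter_files(repo: Path, roots=SEARCH_ROOTS):
    """Yield every regular file under the given roots, skipping roots that do not exist."""
    for root in roots:
        target = repo / root
        if target.is_file():
            yield target
        elif target.is_dir():
            yield from (p for p in target.rglob("*") if p.is_file())


def count_references(repo: Path, needles, roots=SEARCH_ROOTS):
    """Return {needle: number of files whose text contains that substring}."""
    counts = {needle: 0 for needle in needles}
    for path in iter_files(repo, roots):
        # Undecodable bytes are ignored so binary files do not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        for needle in needles:
            if needle in text:
                counts[needle] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage from the repository root: file names plus one config key.
    for needle, n in count_references(Path("."), ["add_labels.py", "VoC.json", "outgroup"]).items():
        print(f"{needle}: {n} file(s)")
```

A count of zero for a name is necessary but not sufficient evidence that it is dead, since references built dynamically (string concatenation, wildcards) would not match. The unchanged dry-run DAG covers part of that gap.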

Test plan

  • CI green.

🤖 Generated with Claude Code

trvrb and others added 2 commits June 26, 2026 20:14
First pass of a pandemic-era cruft cleanup. Removes code and data that is
referenced by nothing in the workflow, the active profiles, the docs, or the
tests (verified by grep plus a dry-run of the CI profile showing an unchanged
37-job DAG). No behavior change to any build.

Removed:
- Orphan scripts: scripts/add_labels.py, scripts/generate-scientific-credits.py,
  and the explicitly-deprecated scripts/deprecated/ (calculate_delta_frequency.py,
  parse_mutational_fitness_tsv_into_distance_map.py).
- Unused defaults: defaults/distance_maps/VoC.json (only S1.json is used),
  defaults/clade_hierarchy.tsv, defaults/clades_who.tsv.
- Dead config key files.outgroup (defaults/parameters.yaml) — it pointed at a
  file that does not exist in the repo and was read nowhere; its config-reference
  entry (already documented "No longer used") is removed too.
- Deprecated my_profiles/ directory (only a deprecation README) and its now-orphan
  .gitignore exception.
- Unused committed example data: data/example_*_worldwide.*, data/example_*_aus.*,
  data/example_multiple_inputs.tar.xz (the CI-used example_metadata.tsv /
  example_sequences.fasta.gz are kept).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>