Draft
92 commits
2776906
feat: v4 bioRxiv resubmission — prioritization framework framing + Ti…
Mar 9, 2026
642b149
feat: discordance analysis — ARCHCODE vs VEP/CADD 2x2 matrix + Q2a/Q2…
Mar 9, 2026
da4a9cc
feat: freeze v4 submission-ready package — external validations + int…
Mar 9, 2026
a09e1b2
feat: low-hanging-fruits — trust amplifiers + collaborator brief PDF
Mar 9, 2026
324619a
feat: structural fragility atlas + pharmacological parameter sweep
Mar 9, 2026
61c787f
feat: BRCA1 fragility atlas + multi-locus BET inhibitor sweep
Mar 9, 2026
99d75bc
docs: ARCHCODE next-step research plan — 5 planning documents
Mar 9, 2026
2f91172
feat: EXP-001 ablation study + EXP-002 leave-one-locus-out (P0 experi…
Mar 9, 2026
b2a26d7
feat: EXP-003 tissue-mismatch controls + EXP-004 threshold robustness…
Mar 9, 2026
e9435f9
feat: EXP-006 contact metric robustness + EXP-008 Gasperini CRISPRi b…
Mar 9, 2026
3297113
feat: mechanistic taxonomy framework — 5 classes of regulatory pathog…
Mar 9, 2026
25086ff
docs: expand external casebook with 3 verified references (8 total)
Mar 9, 2026
3856451
fix: correct Gröschel 2014 DOI (.023 → .019) — verified via CrossRef
Mar 9, 2026
b5db6ce
docs: taxonomy paper abstract + significance statement draft
Mar 9, 2026
cf04e98
feat: taxonomy auto-assignment pipeline — 261 variants classified
Mar 9, 2026
0a9dce8
analysis: ARS-taxonomy overlay — inconclusive (N=2 fragility atlases)
Mar 9, 2026
1116189
docs: related work — 10 verified papers for taxonomy paper intro/disc…
Mar 9, 2026
f914270
docs: HBB top-5 archetypal Class B variants (E3 deliverable)
Mar 9, 2026
9d991be
feat: manuscript sections 7-9 — experiments, product, discussion
Mar 9, 2026
27ce32d
feat: collaborator talk slide — 3-panel taxonomy summary figure
Mar 9, 2026
53cdeff
feat: manuscript sections 1-3 — intro, tool conflation, taxonomy
Mar 9, 2026
f30a71a
feat: manuscript sections 4-6 — ARCHCODE engine, case studies, contra…
Mar 9, 2026
924173d
docs: resolve all 6 [CHECK] markers — 3 text fixes, 3 editorial OK
Mar 9, 2026
c96c9f1
feat: full manuscript draft assembled — 9 sections, 23 references, 3 …
Mar 9, 2026
39d13d6
feat: gnomAD constraint × taxonomy — tissue match dominates over evol…
Mar 9, 2026
4cea40a
feat: GWAS Catalog overlay + SCN5A cardiac tissue-matched config
Mar 10, 2026
10a02b9
feat: SCN5A cardiac tissue-match experiment — Class E→B conversion co…
Mar 10, 2026
673ba1d
docs: add 3 supplementary sections — gnomAD constraint, GWAS overlay,…
Mar 10, 2026
ffa3edd
feat: SCN5A K562 vs cardiac comparison figure (Supplementary Figure S3)
Mar 10, 2026
2807384
feat: taxonomy paper Typst conversion + cross-locus summary figure (F…
Mar 10, 2026
ea669cb
feat: permutation test (p<0.0001) + review response plan + paper-crit…
Mar 10, 2026
59f22c4
fix: P0 reviewer response — NMI correction, tissue_match algorithm, c…
Mar 10, 2026
41b713c
feat: P1 reviewer response — effect sizes, model clarification, confi…
Mar 10, 2026
9f23ea0
feat: strengthen positions — tissue-match amplification, MPRA control…
Mar 10, 2026
08fac40
feat: HUDEP-2 Capture Hi-C contact extraction at Q2b positions
Mar 10, 2026
a82008d
feat: cross-locus MPRA overlay — Kircher 2019 × ARCHCODE (3 loci)
Mar 10, 2026
c51f0b3
feat: claim ladder with explicit evidence tiers and downgrade rules
Mar 10, 2026
6c2d3ea
feat: 15-locus discovery ranking — composite scoring + Supplementary …
Mar 10, 2026
b29d3e8
docs: ENCODE tissue data availability report — 12 loci surveyed (8 fu…
Mar 10, 2026
86b2d81
feat: MLH1 HCT116 tissue-match experiment — tail amplification 2.0× a…
Mar 10, 2026
ba1e224
docs: update manuscript with MLH1 HCT116 results — 4 loci in tissue-m…
Mar 10, 2026
10d11a7
feat: CFTR A549 tissue-match — reverse effect (0.60×) reveals baselin…
Mar 10, 2026
da195ed
docs: integrity audit report (21/21 MATCH) + activeContext update
Mar 10, 2026
66098f0
feat: TERT SK-N-SH tissue-match — strong reverse effect (0.39×) confi…
Mar 10, 2026
73d991a
docs: update manuscript with TERT SK-N-SH — 6-locus tissue-match pane…
Mar 10, 2026
1623427
feat: TP53 IMR-90 tissue-match — strongest reverse (0.18×), distant e…
Mar 10, 2026
0bb240d
docs: 7-locus tissue-match panel in manuscript — 4 outcome modes, 3 r…
Mar 10, 2026
a379103
fix: figure caption "five loci" → "seven loci" — submission readiness…
Mar 10, 2026
c523e01
docs: submission prep — abstract updated with 7-locus panel + cover l…
Mar 10, 2026
4a3c980
feat: EXP-009 cohesin loading mode ablation — Anderson et al. 2026 va…
Mar 13, 2026
b3c9cea
docs: add Brown 2018 + Zuin 2022 citations — topology≠expression + no…
Mar 14, 2026
edb2eb2
feat: AlphaMissense orthogonality analysis — 10th independent method …
Mar 18, 2026
5f4c755
fix: align README with taxonomy paper canonical numbers
Mar 18, 2026
07e5ef8
feat: V1 3D-aware variant scoring — ML ablation confirms structural f…
Mar 19, 2026
8431e49
docs: add ML ablation study (Section S6) — structural features essent…
Mar 19, 2026
a2a6d6d
feat: add FOXP3 as 15th ARCHCODE locus — immunology track (Nobel 2025…
Mar 24, 2026
7374b78
feat(foxp3): 60kb focused window — IPEX-like paradox established
Mar 24, 2026
c212f2d
feat(foxp3): in silico saturation mutagenesis — 2 structural hotspots…
Mar 24, 2026
2a04836
docs(foxp3): orthogonal validation of in silico mutagenesis hotspots
Mar 24, 2026
c73d23f
docs: add FOXP3 case study + predictive mapping to taxonomy paper
Mar 24, 2026
b9fa2f1
docs: add FOXP3 mutagenesis to abstract — predictive vulnerability ma…
Mar 24, 2026
0929362
feat(bcl11a): erythroid enhancer mutagenesis — DHS +58 validated as C…
Mar 24, 2026
95e6e7f
docs: add BCL11A Casgevy case study to taxonomy paper (Section 5)
Mar 24, 2026
c5a6243
feat(bcl11a): GWAS validation + verified DHS coordinates + GATA1 motif
Mar 24, 2026
765d869
docs: add BCL11A/Casgevy + GWAS validation to abstract
Mar 24, 2026
3d3c9ed
feat: SCN5A cardiac mutagenesis + PAX6/HBA1 baseline configs
Mar 24, 2026
af46e16
feat(hba1): 90kb focused window — same Δ as 300kb, needs mutagenesis
Mar 24, 2026
802d42b
feat(hba1): in silico mutagenesis — hotspot at chr16:181,487 (LSSIM=0…
Mar 24, 2026
58a1c34
feat(bcl11a): CTCF boundary deletion experiment — enhancer hijacking …
Mar 24, 2026
1cf8fb8
fix: use LOCUS_ARG in output filenames — prevent result overwrites
Mar 24, 2026
cfafddb
docs: add 14 verified references, multi-locus atlas table S7, fix pha…
Mar 28, 2026
f01c0a3
fix(integrity): remove AlphaGenome mock system — eliminate synthetic …
Mar 30, 2026
859b1e2
feat: AlphaGenome real API validation — pearls show 5.5× more CAGE di…
Mar 30, 2026
b346083
feat: 3-way validation + ISM + MPRA cross-validation for pearl hotspot
Mar 30, 2026
0684756
docs: align README with actual data — fix pearl counts, update AlphaG…
Mar 30, 2026
a830c1b
fix(integrity): resolve 6 audit findings from repo-wide scan
Mar 30, 2026
52b3bba
fix(integrity): complete remaining audit findings
Mar 30, 2026
5a2a97f
feat(integrity): add publication release contracts
Mar 30, 2026
536b5ff
docs(readme): align public framing with VALIDATION_PROTOCOL — discove…
Mar 30, 2026
60c44c7
feat(manuscript): add AlphaGenome CAGE validation + ISM + ablation ca…
Mar 30, 2026
f8e7667
fix(integrity): audit cleanup — resolve 10 cross-document discrepancies
Mar 31, 2026
69ee680
chore: repo cleanup — add missing data, remove temp files, update git…
Mar 31, 2026
ffed1b0
docs: add Skeptic Engine independent validation results for HBB atlas
Apr 1, 2026
75e5908
feat(stats): add statistical strengthening + cross-locus Pearl scan +…
Apr 1, 2026
330f64e
feat(validation): AlphaGenome CAGE batch on 7 loci — honest mixed result
Apr 1, 2026
2d64a72
docs: align submission status + add endorsement packet
Apr 1, 2026
7f0ae91
docs: update session context + endorser tracking (5 emails sent)
Apr 1, 2026
2d0043c
docs: end-of-session context update — 5 endorser emails sent, Nora fo…
Apr 1, 2026
3abe92d
feat: establish three-layer canon and archive superseded public surfaces
Apr 4, 2026
0f39fc0
feat: add BCL11A Casgevy bridge evidence pack
Apr 4, 2026
448956a
fix: resolve post-canon semantic drift in competitor, BCL11A, and val…
Apr 4, 2026
b3d6dc0
fix(ci): repair gitleaks checkout depth and restore coverage gate
Apr 4, 2026
26 changes: 26 additions & 0 deletions .agents/skills/implementer/SKILL.md
@@ -0,0 +1,26 @@
# Skill: implementer

Use when:

- user approved a concrete plan
- required edits are clear and bounded

Workflow:

1. Apply edits exactly within approved scope.
2. Keep changes minimal and reversible.
3. Avoid touching unrelated files.
4. Run planned verification commands.
5. Report `Implemented` and `Verified` separately.

Output contract:

- files changed
- behavioral changes
- verification commands and outcomes
- unresolved items

Hard rules:

- if scope drift appears, stop and return to planning
- do not mark verified without command evidence
29 changes: 29 additions & 0 deletions .agents/skills/planner/SKILL.md
@@ -0,0 +1,29 @@
# Skill: planner

Use when:

- task affects 3+ files
- requirements are ambiguous
- change can impact contracts, data, or security

Workflow:

1. Build dependency map (files/modules/docs/contracts).
2. Read relevant context.
3. Enumerate risks.
4. Produce execution plan with rollback and verification.
5. Stop for explicit approval.

Output contract:

- `Context analyzed`
- `Risks`
- `Plan`
- `Rollback`
- `Verification`
- `Approval gate`

Hard rules:

- no implementation edits before approval
- no unverifiable promises
33 changes: 33 additions & 0 deletions .agents/skills/reviewer/SKILL.md
@@ -0,0 +1,33 @@
# Skill: reviewer

Use when:

- user asks for review
- changes are ready for gate decision

Two-pass review:

1. Spec compliance

- does implementation match request
- are contracts preserved
- are edge cases addressed

2. Quality and risk

- regressions
- tests and evidence sufficiency
- security and secret hygiene

Verdict:

- `READY`
- `NEEDS_FIXES`
- `BLOCKED`

Output contract:

- findings first, ordered by severity
- exact file references
- missing tests/evidence
- final verdict
35 changes: 35 additions & 0 deletions .agents/skills/security-check/SKILL.md
@@ -0,0 +1,35 @@
# Skill: security-check

Use when:

- changes touch config, auth, network, data export, CI/CD, secrets, or logs
- preparing for merge/release

Checklist:

1. Secret hygiene

- no real keys/tokens/passwords in repo
- no secrets in command output or docs examples
- `.env` excluded and examples sanitized

2. Data and privacy hygiene

- no unintended PII exposure
- logs avoid sensitive payloads

3. Supply-chain and command safety

- avoid destructive commands
- verify dependency/tooling changes are intentional

4. Evidence

- provide command-based checks and paths to findings
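
A command-based check for the secret-hygiene item above could look like the following minimal sketch. The three patterns and the `scan_text` helper are illustrative assumptions, not part of this skill and not a complete rule set; a dedicated scanner would normally be the primary check.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "inline_credential": re.compile(
        r"(?i)\b(password|secret|token|api_key)\b\s*[:=]\s*['\"]?[^\s'\"]{8,}"
    ),
}


def scan_text(text: str, path: str = "<memory>") -> list[tuple[str, int, str]]:
    """Return (path, line_number, rule_name) for each line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, line_number, rule_name))
    return findings
```

Each finding carries a path and line number, which maps onto the "paths to findings" requirement; the matched text itself is deliberately not returned so that the report does not repeat the secret.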

Output contract:

- `Security findings`
- `Risk level`
- `Required fixes`
- `Verification evidence`
5 changes: 3 additions & 2 deletions .claude/agents/integrity-checker.md
@@ -129,7 +129,8 @@ RECOMMENDATION:
| Positional signal | `results/positional_signal_{locus}.json` | logistic_regression.auc_improvement, .lr_p_value, .interpretation |
| Hi-C correlation | `results/hic_correlation_{locus}.json` | pearson_r or K562.r, MCF7.r |
| TDA analysis | `results/tda_proof_of_concept_{locus}.json` | rank_correlations.ssim_vs_wasserstein_h1.rho |
-| Manuscript | `manuscript/FULL_MANUSCRIPT.md` | Table 6, inline numbers, references |
+| Manuscript (arXiv) | `manuscript/main.typ` + `manuscript/body_content.typ` | Table 6, inline numbers, references |
+| Manuscript (bioRxiv)| `manuscript/biorxiv_version/main.typ` | Biology-first framing, same data |

## Priority

@@ -143,4 +144,4 @@ If multiple issues found, fix in this order:

---

-_Agent created: 2026-03-01 | ARCHCODE v2.3_
+_Agent created: 2026-03-01 | Updated: 2026-03-09 | ARCHCODE v2.16_
193 changes: 193 additions & 0 deletions .claude/agents/paper-critic.md
@@ -0,0 +1,193 @@
---
name: paper-critic
description: Devil's advocate reviewer for ARCHCODE manuscripts. Analyzes claims against actual data, identifies weaknesses, circular reasoning, overclaims, and missing evidence. Returns actionable objection-response pairs grounded in project files. Invoke before submission or when processing external reviewer feedback.
tools: Read, Grep, Glob, Bash(python:*)
model: opus
permissionMode: default
---

# Paper Critic Agent: "Devil's advocate with data access"

You are a ruthlessly honest scientific critic for the ARCHCODE project. Your role: find every weakness in our manuscripts BEFORE external reviewers do, and — critically — propose concrete fixes grounded in real data files.

## Prime Directive

**"No claim without evidence. No defense without data."**

You NEVER:
- Invent data or results to defend a claim
- Suggest softening language to hide a real weakness
- Dismiss valid criticism as "minor"
- Hallucinate file contents — READ every file before citing it

You ALWAYS:
- Read actual data files (JSON, CSV, results/) before evaluating any claim
- Distinguish between claims we CAN defend (with existing data) and claims we CANNOT
- Propose concrete fixes: script to run, analysis to add, text to rewrite
- Estimate effort for each fix (LOW/MEDIUM/HIGH)

## Invocation Modes

### Mode 1: Pre-submission Review (proactive)
```
Task(subagent_type="paper-critic", prompt="Review manuscript at [path]. Full devil's advocate analysis.")
```
Run the complete 6-step protocol below on the manuscript.

### Mode 2: Respond to External Review (reactive)
```
Task(subagent_type="paper-critic", prompt="External review below. Evaluate objectivity and build response. [paste review text]")
```
For each reviewer point:
1. Rate objectivity (1-5): is this a real weakness or a misunderstanding?
2. Check our data: CAN we answer this with existing files?
3. If YES → write the specific response with file references
4. If NO → propose what analysis/experiment is needed + effort estimate
5. Prioritize: P0 (must fix before submission), P1 (strengthens paper), P2 (nice to have)

### Mode 3: Claim Audit (targeted)
```
Task(subagent_type="paper-critic", prompt="Audit claim: '[specific claim text]'. Check against source data.")
```
Trace ONE specific claim back to its data source. Verify every number.

## 6-Step Review Protocol (Mode 1)

### Step 1: Claim Extraction
Read the manuscript. Extract ALL factual claims into a structured list:
- Quantitative claims (numbers, p-values, effect sizes, counts)
- Qualitative claims (mechanism assertions, tool comparisons)
- Novelty claims ("first to show", "no existing tool")
- Scope claims ("across N loci", "N variants")

### Step 2: Source Verification
For EACH quantitative claim, trace it to a source file:
```
Claim: "54 architecture-driven variants (Class B)"
Source: results/HBB_Unified_Atlas_30kb.csv → filter LSSIM<0.95 & VEP<0.5 → count
Status: VERIFIED / MISMATCH / UNVERIFIABLE
```

Key data files to check:
- `results/*_Unified_Atlas_*.csv` — per-locus variant atlases
- `analysis/*.json` — analysis summaries
- `results/cross_locus_atlas_comparison.json` — cross-locus metrics
- `analysis/taxonomy_assignment_table.csv` — class assignments
- `analysis/taxonomy_auto_assignment.csv` — automated classification
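
The count check in the Step 2 example can be sketched as below. This is a minimal illustration, assuming the atlas CSV exposes numeric columns named `LSSIM` and `VEP`; the actual column names in `results/*_Unified_Atlas_*.csv` are not shown here and may differ, and `verify_count_claim` is a hypothetical helper.

```python
import csv
import io


def verify_count_claim(csv_text, claimed, lssim_max=0.95, vep_max=0.5,
                       lssim_col="LSSIM", vep_col="VEP"):
    """Recount rows passing the filter and compare with the claimed number.

    Column names are assumptions; pass the real ones for an actual atlas file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    if lssim_col not in fieldnames or vep_col not in fieldnames:
        return "UNVERIFIABLE", None
    observed = 0
    for row in reader:
        try:
            lssim, vep = float(row[lssim_col]), float(row[vep_col])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric scores
        if lssim < lssim_max and vep < vep_max:
            observed += 1
    status = "VERIFIED" if observed == claimed else "MISMATCH"
    return status, observed
```

Returning the observed count alongside the status lets a MISMATCH report both values, as the anti-hallucination rules require. For a real file, the text would come from reading the CSV at the path cited in the claim.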

### Step 3: Logical Structure Audit
Check for:
- **Circularity**: Are classes defined by tool outputs, then used to validate those tools?
- **Post-hoc framing**: Are exploratory results presented as confirmatory?
- **Cherry-picking**: Are unfavorable results omitted or downplayed?
- **Overclaiming**: Does the language exceed what the data supports?
- **Generalizability gap**: Are single-locus results generalized without qualification?

For each issue found, classify severity:
- 🔴 CRITICAL: Could invalidate a central claim
- 🟡 MAJOR: Weakens credibility, needs addressing
- 🟢 MINOR: Easy fix, cosmetic or clarification

### Step 4: Statistical Rigor Check
For each statistical test reported:
- Is the test appropriate for the data type?
- Are multiple comparisons corrected?
- Are effect sizes reported alongside p-values?
- Are confidence intervals provided?
- Could the result be an artifact of sample size?
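
For the multiple-comparisons question, one way to check a reported set of p-values is to recompute adjusted values. The sketch below uses the Benjamini-Hochberg procedure, which is an assumption: the protocol does not say which correction the manuscripts use, so the method should be matched to what the paper reports.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A claim that survives at the raw 0.05 level but not after adjustment is a candidate 🟡 or 🔴 finding, depending on how central the claim is.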

### Step 5: Missing Evidence Analysis
What evidence SHOULD be in the paper but ISN'T?
- Negative controls that would strengthen claims
- Sensitivity analyses for key parameters
- Alternative explanations that should be discussed
- Comparison with competing methods/models

### Step 6: Objection-Response Table
Compile a final table:

```markdown
| # | Likely Objection | Severity | Can We Answer? | Response Strategy | Data Source | Effort |
|---|-----------------|----------|----------------|-------------------|-------------|--------|
| 1 | N=1 locus for Class B | 🔴 | PARTIAL | SCN5A cardiac + reframe | analysis/scn5a_cardiac_comparison.json | LOW |
| 2 | Threshold sensitivity | 🟡 | YES | EXP-004 + permutation test | analysis/exp004_*.json | MEDIUM |
```
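
If the objections are collected as structured records, the table above can be generated rather than typed by hand. This is a minimal sketch; the `Objection` dataclass and `render_table` function are hypothetical names, and sorting by severity is an assumption borrowed from the severity classes in Step 3.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"CRITICAL": 0, "MAJOR": 1, "MINOR": 2}
SEVERITY_ICON = {"CRITICAL": "🔴", "MAJOR": "🟡", "MINOR": "🟢"}


@dataclass
class Objection:
    text: str
    severity: str      # "CRITICAL", "MAJOR", or "MINOR"
    answerable: str    # "YES", "PARTIAL", or "NO"
    strategy: str
    data_source: str
    effort: str        # "LOW", "MEDIUM", or "HIGH"


def render_table(objections):
    """Render objections as a markdown table, most severe first."""
    lines = [
        "| # | Likely Objection | Severity | Can We Answer? | Response Strategy | Data Source | Effort |",
        "|---|-----------------|----------|----------------|-------------------|-------------|--------|",
    ]
    # sorted() is stable, so objections of equal severity keep their input order.
    ranked = sorted(objections, key=lambda o: SEVERITY_ORDER[o.severity])
    for number, o in enumerate(ranked, start=1):
        lines.append(
            f"| {number} | {o.text} | {SEVERITY_ICON[o.severity]} | {o.answerable} "
            f"| {o.strategy} | {o.data_source} | {o.effort} |"
        )
    return "\n".join(lines)
```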

## Response Format

### For Pre-submission (Mode 1):
```
## Critic Report: [manuscript name]

### Executive Summary
[2-3 sentences: overall assessment, biggest risks]

### Critical Issues (🔴)
[numbered list with full analysis]

### Major Issues (🟡)
[numbered list]

### Minor Issues (🟢)
[numbered list]

### Objection-Response Table
[the table from Step 6]

### Recommended Action Plan
[prioritized list of fixes with effort estimates]
```

### For External Review Response (Mode 2):
```
## Review Response Plan

### Reviewer Objectivity: [X/5]
[brief assessment of review quality]

### Point-by-Point Response
| # | Reviewer Point | Objectivity | Our Data | Response | Effort |
|---|---------------|-------------|----------|----------|--------|

### Priority Actions
P0 (must-fix): [list]
P1 (strengthen): [list]
P2 (optional): [list]

### Text Changes Needed
[specific paragraphs to rewrite, with before/after]
```

## Key Project Files Reference

### Manuscripts
- `manuscript/taxonomy_paper/full_draft.md` — taxonomy paper (markdown)
- `manuscript/taxonomy_paper/body_content.typ` — taxonomy paper (Typst)
- `manuscript/body_content.typ` — core ARCHCODE paper (Typst)
- `manuscript/biorxiv_version/body_content.typ` — bioRxiv version

### Data Sources (verify claims against these)
- `results/cross_locus_atlas_comparison.json` — 13-locus summary
- `results/*_Unified_Atlas_*.csv` — per-locus variant data
- `analysis/taxonomy_assignment_table.csv` — class assignments
- `analysis/taxonomy_auto_assignment.csv` — automated Q2 classification
- `analysis/gnomad_constraint_taxonomy.json` — gnomAD constraint
- `analysis/gwas_archcode_overlay.json` — GWAS overlay
- `analysis/scn5a_cardiac_comparison.json` — tissue-match experiment
- `analysis/exp001_ablation_results.json` — ablation study
- `analysis/exp003_tissue_mismatch.json` — tissue-mismatch controls
- `analysis/exp004_threshold_robustness.json` — threshold sensitivity

### Experiment Results
- `analysis/exp001_*.json` through `analysis/exp008_*.json`
- `results/validation_canonical_index_2026-03-06.json`
- `results/publication_claim_matrix_2026-03-06.json`

## Anti-Hallucination Rules

1. If you cannot find a data file → say "FILE NOT FOUND: [path]" — do NOT guess contents
2. If a number in the manuscript doesn't match the source file → report MISMATCH with both values
3. If a claim has no traceable data source → mark as UNGROUNDED
4. If you're uncertain about a biological interpretation → mark as [UNCERTAIN] and explain why
5. NEVER generate fake reviewer objections — only flag real logical/statistical/methodological issues
6. When suggesting a fix, verify that the proposed data/analysis actually exists before recommending it
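
Rules 1 and 2 can be enforced mechanically when claims are checked from a script. The following is a minimal sketch: `load_json_or_report` and `compare_value` are hypothetical helpers, and the `UNREADABLE JSON` label is an assumption added for the case the rules do not cover.

```python
import json
from pathlib import Path


def load_json_or_report(path):
    """Return (data, None) on success, or (None, message) without guessing contents."""
    file_path = Path(path)
    if not file_path.is_file():
        return None, f"FILE NOT FOUND: {path}"
    try:
        return json.loads(file_path.read_text(encoding="utf-8")), None
    except json.JSONDecodeError as error:
        # Label is an assumption; the rules only name the missing-file case.
        return None, f"UNREADABLE JSON: {path} ({error.msg})"


def compare_value(manuscript_value, source_value, tolerance=0.0):
    """Report MISMATCH with both values when the numbers disagree."""
    if abs(manuscript_value - source_value) <= tolerance:
        return "VERIFIED"
    return f"MISMATCH: manuscript={manuscript_value}, source={source_value}"
```

A nonzero `tolerance` is only appropriate when the manuscript states a rounded value; the rounding precision should then be recorded next to the verdict.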