feat(m2a-wettest): She-Proves Tier A wet test — 8 clips, M2a prosody by shaypal5 · Pull Request #2 · DataHackIL/avdp-synth-corpus

shaypal5 · 2026-04-15T16:35:05Z

Summary

8 She-Proves Tier A clips across all 4 violence typologies (SV × 2, IT × 2, NEG × 2, NEU × 2)
Generated with M2a SSML prosody defaults from SynthBanshee PR #24
Includes TTS utterance cache (assets/speech/) and LLM script cache (assets/scripts/) for full pipeline reproducibility

Wet test results

Check	Value	Threshold	Status
VIC I1 F0	160.1 Hz	≤ 200 Hz	✅
VIC I4 F0	185.0 Hz	< 250 Hz	✅
VIC I5 F0	200.1 Hz	< 250 Hz	✅
AGG pitch escalation	flat / rate-driven	no helium artefact	✅ (subjective)
AGG RMS escalation	+0.4 dB	≥ 8 dB	⏳ pending M3

The low AGG RMS escalation (+0.4 dB against a ≥ 8 dB target) is caused by a known Azure normalization limitation; per-turn gain will be applied in M3 (SceneMixer).
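The checks in the table can be expressed as simple threshold comparisons. The following is a minimal sketch, not the project's actual QA code: the function names `passes`, `rms_escalation_db`, and `gain_for_db` are hypothetical, and only the measured values and thresholds are taken from the table above.

```python
import math

# Measured values and thresholds copied from the wet test table.
# Operators mirror the table: "<=" for I1, "<" for I4 and I5.
F0_CHECKS = [
    ("VIC I1 F0", 160.1, "<=", 200.0),
    ("VIC I4 F0", 185.0, "<", 250.0),
    ("VIC I5 F0", 200.1, "<", 250.0),
]


def passes(value: float, op: str, threshold: float) -> bool:
    """Return True if `value` satisfies the comparison against `threshold`."""
    if op == "<=":
        return value <= threshold
    if op == "<":
        return value < threshold
    if op == ">=":
        return value >= threshold
    raise ValueError(f"unsupported operator: {op}")


def rms_escalation_db(rms_low: float, rms_high: float) -> float:
    """Level difference in dB between two linear RMS amplitudes."""
    return 20.0 * math.log10(rms_high / rms_low)


def gain_for_db(db: float) -> float:
    """Linear amplitude factor corresponding to a gain in dB."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    for name, value, op, threshold in F0_CHECKS:
        print(name, "pass" if passes(value, op, threshold) else "fail")
    # The reported +0.4 dB escalation does not meet the >= 8 dB target.
    print("AGG RMS escalation", "pass" if passes(0.4, ">=", 8.0) else "pending")
```

Under the usual amplitude convention, an 8 dB escalation corresponds to a linear factor of about 2.5, which gives a sense of how much per-turn gain would be needed if the synthesized turns arrive at roughly equal level.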

Test plan

Listen to at least one clip per typology and confirm no obvious synthesis artefacts
Confirm VIC sounds subdued/flat under stress (not rising pitch)
Confirm AGG sounds faster/more intense at high intensity (not helium)

🤖 Generated with Claude Code

8 clips across all 4 violence typologies (SV, IT, NEG, NEU) for she_proves Tier A.

Generated with M2a SSML prosody defaults:
- VIC pitch_delta_st capped falling (-4 → -1 st across I1–I5)
- AGG pitch_delta_st flattened (0/0/+1/+1 across I2–I5)

Includes TTS utterance cache (assets/speech/) and LLM script cache (assets/scripts/) for full pipeline reproducibility.

Wet test results:
VIC I1 F0: 160.1 Hz ✓ (≤200 Hz)
VIC I4 F0: 185.0 Hz ✓ (<250 Hz)
VIC I5 F0: 200.1 Hz ✓ (<250 Hz)
AGG pitch: flat/rate-driven ✓ (subjective; no helium artefact)
AGG RMS escalation: pending M3 (SceneMixer per-turn gain)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
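To illustrate how a per-intensity pitch_delta_st value might end up in SSML, here is a rough sketch. It is an assumption about the rendering step, not SynthBanshee's actual code: `prosody_ssml` and the two lookup tables are hypothetical names, and only the AGG values for I2–I5 and the VIC endpoints for I1 and I5 come from the commit message. Intermediate VIC values are not stated and are left out.

```python
from xml.sax.saxutils import escape

# AGG pitch deltas in semitones for intensities I2..I5, as listed in the commit.
AGG_PITCH_DELTA_ST = {2: 0, 3: 0, 4: 1, 5: 1}

# Only the VIC endpoints are given in the commit (I1 = -4 st, I5 = -1 st).
VIC_PITCH_DELTA_ST_ENDPOINTS = {1: -4, 5: -1}


def prosody_ssml(text: str, pitch_delta_st: int) -> str:
    """Wrap `text` in an SSML <prosody> element with a relative semitone shift.

    A zero delta returns the escaped text without a wrapper; that is a
    choice made for this sketch, not something the PR specifies.
    """
    safe = escape(text)
    if pitch_delta_st == 0:
        return safe
    return f'<prosody pitch="{pitch_delta_st:+d}st">{safe}</prosody>'


if __name__ == "__main__":
    print(prosody_ssml("example turn", AGG_PITCH_DELTA_ST[5]))
    print(prosody_ssml("example turn", VIC_PITCH_DELTA_ST_ENDPOINTS[1]))
```

Keeping the deltas in a plain lookup table makes the "flattened" AGG profile easy to audit against the values quoted above.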

for more information, see https://pre-commit.ci

- DELIVERIES.md: master log table (one row per merged PR) linked from README, with clip counts, duration, typology breakdown, pipeline milestone, status, and PR link.
- deliveries/001-debug-run-1/: metadata.yaml + notes.md for PR #1 (single debug clip, v1 pipeline, status: superseded).
- deliveries/002-m2a-wettest/: metadata.yaml + notes.md for PR #2 (8 clips, M2a prosody milestone, status: provisional). Includes full prosody QA table and subjective QA results.
- README.md: adds "Delivery history" section linking to DELIVERIES.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Adds Delivery 002 (“m2a-wettest”) synthetic Hebrew She-Proves Tier A wet-test batch (8 clips across SV/IT/NEG/NEU) along with delivery logging and pipeline reproducibility caches/metadata.

Changes:

Adds delivery notes + structured metadata for deliveries 001 and 002, and introduces a top-level delivery log (DELIVERIES.md) referenced from the README.
Adds 8 new Tier A clips (transcripts + strong labels + clip metadata) and a language-level data/he/manifest.csv.
Adds cached scene scripts under assets/scripts/ for reproducibility.

Reviewed changes

Copilot reviewed 39 out of 193 changed files in this pull request and generated 17 comments.

File	Description
deliveries/002-m2a-wettest/notes.md	Delivery 002 narrative notes, QA summary, limitations
deliveries/002-m2a-wettest/metadata.yaml	Delivery 002 structured metadata (counts, QA metrics, speakers, engine)
deliveries/001-debug-run-1/notes.md	Delivery 001 narrative notes (superseded)
deliveries/001-debug-run-1/metadata.yaml	Delivery 001 structured metadata (superseded)
DELIVERIES.md	New delivery log table + status definitions
README.md	Adds “Delivery history” section pointing to `DELIVERIES.md`
data/he/manifest.csv	Adds dataset manifest for the 8 clips
data/he/agg_m_30-45_001/sp_sv_a_0001_00.txt	Transcript for SV clip 0001
data/he/agg_m_30-45_001/sp_sv_a_0001_00.jsonl	Strong labels for SV clip 0001
data/he/agg_m_30-45_001/sp_sv_a_0001_00.json	Clip metadata for SV clip 0001
data/he/agg_m_30-45_001/sp_sv_a_0002_00.txt	Transcript for SV clip 0002
data/he/agg_m_30-45_001/sp_sv_a_0002_00.jsonl	Strong labels for SV clip 0002
data/he/agg_m_30-45_001/sp_sv_a_0002_00.json	Clip metadata for SV clip 0002
data/he/agg_m_30-45_001/sp_it_a_0001_00.txt	Transcript for IT clip 0001 (updated content/timing)
data/he/agg_m_30-45_001/sp_it_a_0001_00.jsonl	Strong labels for IT clip 0001 (updated)
data/he/agg_m_30-45_001/sp_it_a_0001_00.json	Clip metadata for IT clip 0001 (updated)
data/he/agg_m_30-45_001/sp_it_a_0002_00.txt	Transcript for IT clip 0002
data/he/agg_m_30-45_001/sp_it_a_0002_00.jsonl	Strong labels for IT clip 0002
data/he/agg_m_30-45_001/sp_it_a_0002_00.json	Clip metadata for IT clip 0002
data/he/agg_m_30-45_001/sp_neg_a_0001_00.txt	Transcript for NEG clip 0001
data/he/agg_m_30-45_001/sp_neg_a_0001_00.jsonl	Strong labels for NEG clip 0001
data/he/agg_m_30-45_001/sp_neg_a_0001_00.json	Clip metadata for NEG clip 0001
data/he/agg_m_30-45_001/sp_neg_a_0002_00.txt	Transcript for NEG clip 0002
data/he/agg_m_30-45_001/sp_neg_a_0002_00.jsonl	Strong labels for NEG clip 0002
data/he/agg_m_30-45_001/sp_neg_a_0002_00.json	Clip metadata for NEG clip 0002
data/he/agg_m_30-45_001/sp_neu_a_0001_00.txt	Transcript for NEU clip 0001
data/he/agg_m_30-45_001/sp_neu_a_0001_00.jsonl	Strong labels for NEU clip 0001
data/he/agg_m_30-45_001/sp_neu_a_0001_00.json	Clip metadata for NEU clip 0001
data/he/agg_m_30-45_001/sp_neu_a_0002_00.txt	Transcript for NEU clip 0002
data/he/agg_m_30-45_001/sp_neu_a_0002_00.jsonl	Strong labels for NEU clip 0002
data/he/agg_m_30-45_001/sp_neu_a_0002_00.json	Clip metadata for NEU clip 0002
assets/scripts/ca24e3d5f040452f2df8df3b2cd7d8d34491c2f4c896f9e08f397fdd61ce2b69.json	Cached generated script (scene turns)
assets/scripts/c507234e77bfe487531218f506cdf223e3ae4d1dfcd356cb09c0c3ecbf154663.json	Cached generated script (scene turns)
assets/scripts/a58e4534b8cb00c83b01c358afb9039f17568297dad3c8d257a84194fe28885f.json	Cached generated script (scene turns)
assets/scripts/9e184d6024e8ca23d5afd44d01a00d89ae647b03f14faab76cabe66e00ba0965.json	Cached generated script (scene turns)
assets/scripts/804926fb260415c74f62fa3332dc185e50180551ba2e5c24b61bf18dd6545fbb.json	Cached generated script (scene turns)
assets/scripts/69b29ad8e90d20476c22282fc43528eae4d1a2dda1b7dd1c0fe6628287417903.json	Cached generated script (scene turns)
assets/scripts/498b07ded13fdae1ae18e3075afb85373bccfc863d9acefeb9f0ac6b4f3c0cba.json	Cached generated script (scene turns)
assets/scripts/0743595033eb507e3f61aaf1a11f150a700182efe045fce0b43bc8bbf13fc881.json	Cached generated script (scene turns)
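The 64-hex-character filenames under assets/scripts/ are consistent with a content-addressed cache keyed by a SHA-256 digest. The sketch below shows one way such a key could be derived. It is an assumption, not the pipeline's documented scheme: `script_cache_key`, `script_cache_path`, and the request fields in the usage example are all hypothetical.

```python
import hashlib
import json


def script_cache_key(request: dict) -> str:
    """Derive a stable SHA-256 hex digest from a script-generation request.

    Keys are sorted and separators fixed so that logically equal requests
    hash to the same value regardless of dict insertion order.
    """
    canonical = json.dumps(
        request, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def script_cache_path(request: dict) -> str:
    """Repo-relative path where the cached script for `request` would live."""
    return f"assets/scripts/{script_cache_key(request)}.json"


if __name__ == "__main__":
    # Hypothetical request fields, for illustration only.
    print(script_cache_path({"typology": "SV", "tier": "A", "seed": 1}))
```

Hashing a canonical serialization means an unchanged request maps to the same file on every run, which is what lets a committed cache reproduce a pipeline run without calling the LLM again.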

Comments suppressed due to low confidence (1)

data/he/agg_m_30-45_001/sp_it_a_0001_00.json:58

transcript_path is an absolute local path and includes a workstation username, which breaks portability and leaks environment-specific information. Also, weak_label.has_violence introduces a binary violence flag that conflicts with the repo spec in README.md (no binary violence/non-violence labels). Prefer repo-relative transcript_path (or omit it) and remove/rename has_violence unless the spec is updated.

…hreads

Adds two GitHub Actions workflows using shaypal5/pr-agent-context@v4:
- ci.yml: initial run on pull_request events
- pr-agent-context-refresh.yml: refresh flow triggered by review events and check_run completions, using publish_mode=append and include_outdated_review_threads=true so stale diff threads remain visible

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…sh on push

- Remove `secrets: github_token:` from both workflows; the reusable workflow uses github.token implicitly and defines no secrets inputs
- Add `pull_request: synchronize` trigger to refresh workflow so it runs on every new commit pushed to the PR
- Fall back to github.run_id in concurrency group for check_run events where pull_requests array may be empty

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- manifest.csv: make wav_path/strong_labels_path repo-relative; lowercase speaker_ids to match on-disk directory naming convention
- All clip .json: make transcript_path repo-relative
- 7 clip .json: set dirty_file_path null where dirty WAV is absent from repo (sp_it_a_0001_00 retains its path because its dirty file is present)
- README: replace prohibition on binary labels with accurate derived-field policy: has_violence is a legitimate convenience field computed from violence_typology/violence_categories/max_intensity; taxonomy is ground truth

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
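The path and speaker-ID fixes in this commit can be sketched as two small normalization helpers. This is an illustrative sketch only: `to_repo_relative` and `normalize_speaker_id` are hypothetical names, the workstation path in the usage example is invented, and the uppercase form of the speaker ID is an assumption about what the manifest held before the fix.

```python
from pathlib import PurePosixPath


def to_repo_relative(path: str, repo_root: str) -> str:
    """Convert an absolute path under `repo_root` to a repo-relative one.

    Paths that are already relative are returned unchanged. An absolute
    path outside `repo_root` raises ValueError rather than being guessed at.
    """
    p = PurePosixPath(path)
    if not p.is_absolute():
        return str(p)
    return str(p.relative_to(PurePosixPath(repo_root)))


def normalize_speaker_id(speaker_id: str) -> str:
    """Lowercase a speaker ID so it matches the on-disk directory name."""
    return speaker_id.lower()


if __name__ == "__main__":
    # Hypothetical absolute path; only the repo-relative tail comes from the PR.
    print(to_repo_relative(
        "/home/someone/corpus/data/he/agg_m_30-45_001/sp_it_a_0001_00.txt",
        "/home/someone/corpus",
    ))
    print(normalize_speaker_id("AGG_M_30-45_001"))
```

Raising on paths outside the repository root, instead of silently passing them through, keeps workstation-specific paths from leaking back into committed metadata.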

CLAUDE.md: full agent/contributor policy file covering:
- Cache integrity rules (assets/speech, assets/scripts)
- Clip file integrity and ASCII filename conventions
- Delivery log requirements (DELIVERIES.md + deliveries/{slug}/)
- Label policy: has_violence is a derived convenience field, not to be removed; taxonomy is ground truth; replacing taxonomy with a binary flag is prohibited
- Audio format spec
- How clips get here (SynthBanshee env vars)
- Validation commands
- What NOT to do (consolidated list)

README.md: add agent callout box at the top pointing to CLAUDE.md; add "Agent and contributor guidelines" section at bottom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-04-16T16:00:16Z

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #2. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.18
Trigger: commit pushed
Workflow run: 24520427940 attempt 1
Comment timestamp: 2026-04-16T15:59:27.377987+00:00
PR head commit: e191abcd1597d26b3b31c581effbddf69dfd42b8

shaypal5 force-pushed the feat/initial-debug-run-1 branch from 07f969d to 2a20b15 Compare April 16, 2026 06:36

pre-commit-ci Bot and others added 2 commits April 16, 2026 06:40

[pre-commit.ci] auto fixes from pre-commit.com hooks

322e03f

for more information, see https://pre-commit.ci

shaypal5 requested a review from Copilot April 16, 2026 13:49

shaypal5 self-assigned this Apr 16, 2026

shaypal5 added the data label Apr 16, 2026

Copilot started reviewing on behalf of shaypal5 April 16, 2026 13:49

Copilot AI reviewed Apr 16, 2026

shaypal5 and others added 2 commits April 16, 2026 17:21

shaypal5 merged commit 839555b into main Apr 16, 2026
3 checks passed

shaypal5 deleted the feat/initial-debug-run-1 branch April 16, 2026 16:04

shaypal5 merged 7 commits into main from feat/initial-debug-run-1

2 participants