-| Dimension | Value |
-|-----------|-------|
-| Clips | 20 |
-| Total duration | ~41.6 min |
-| Projects | `she_proves` (12 clips) + `elephant_in_the_room` (8 clips) |
-| Tiers | A — clean (12) + B — room-augmented (8) |
-| TTS backends | Azure (18 clips) + Google Chirp 3 HD (2 clips) |
-| Validation failures | 0 / 20 |
-| Pipeline | SynthBanshee `0.1.0` @ [`1ea48f3`](https://github.com/DataHackIL/SynthBanshee/commit/1ea48f3) |
+
+
smartphone app
+### She-Proves
+Passively monitors a phone for domestic-violence incidents and preserves audio evidence for legal use. **High-recall** orientation — better to flag and review than to miss.
-Full breakdown: [Deliveries](deliveries.md) · [She-Proves clips](she-proves.md#clips-in-delivery-003) · [Elephant clips](elephant.md#clips-in-delivery-003)
+12 clips · Tier A (clean audio) · scenes 3–6 min · phone-pocket device profile.
----
+[She-Proves guide →](she-proves.md){ .card-link }
+
+
Raspberry Pi · clinic / welfare office
+### Elephant in the Room
+A Pi-class device that alerts security when a social worker is under threat. **High-precision** orientation — false alarms erode trust.
-```
-data/
-  he/                      # ISO 639-1 language code
-    {speaker_dir}/         # e.g. agg_m_30-45_001/ (lowercase of first speaker ID)
-      {clip_id}.wav        # 16 kHz mono 16-bit PCM
-      {clip_id}.txt        # per-turn transcript with onset/offset markers
-      {clip_id}.json       # ClipMetadata (weak labels, provenance, speaker info)
-      {clip_id}.jsonl      # EventLabel records — one JSON object per line
-    manifest.csv           # flat summary of all clips under data/he/
-
-assets/
-  speech/                  # SHA-256-keyed per-utterance WAV cache (do not modify)
-  dirty/                   # pre-preprocessing WAVs, retained per spec
-  scripts/                 # SHA-256-keyed LLM script cache (do not modify)
-
-deliveries/
-  {slug}/
-    metadata.yaml          # structured delivery record
-    notes.md               # narrative QA notes and known limitations
-    qa-report.json         # synthbanshee qa-report output
-```
+8 clips · Tier B (room IR + budget mic + noise) · scenes 1–4 min · alert in final 40%.
-??? info "Why are there four files per clip?"
-    - **`.wav`** — the audio, spec-compliant (normalized, padded, validated)
-    - **`.txt`** — the transcript with turn-level onset/offset markers, used as ASR reference
-    - **`.json`** — `ClipMetadata`: weak labels (`has_violence`, `max_intensity`), speaker list, acoustic scene, provenance (`generation_metadata`)
-    - **`.jsonl`** — `EventLabel` records: one line per strong-label event with category, subtype, onset, offset, intensity, emotional state
+[Elephant guide →](elephant.md){ .card-link }
+
- You only need `.wav` + `.json` for most training pipelines. Add `.jsonl` when you need per-event strong labels or onset/offset supervision.
+
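
The "four files per clip" note in this diff describes a `.json` metadata file and a `.jsonl` file of event records for each clip. A minimal Python sketch of reading both follows, assuming the two files share a stem under `data/he/{speaker_dir}/`. `load_clip_sidecars` is a hypothetical helper, not part of SynthBanshee, and it makes no assumption about the exact key names inside either file.

```python
import json
from pathlib import Path


def load_clip_sidecars(clip_stem: Path) -> tuple[dict, list[dict]]:
    """Load the .json metadata and .jsonl event records that share a clip's stem.

    clip_stem is the clip path without an extension,
    e.g. data/he/<speaker_dir>/<clip_id> (layout taken from the tree above).
    """
    # Append the suffix instead of using with_suffix(), which would
    # truncate stems that happen to contain a dot.
    meta_path = Path(f"{clip_stem}.json")
    events_path = Path(f"{clip_stem}.jsonl")

    # One JSON object holding the clip-level (weak-label) metadata.
    metadata = json.loads(meta_path.read_text(encoding="utf-8"))

    # One JSON object per line; treated as optional here because the note
    # says .wav + .json is enough for most training pipelines.
    events: list[dict] = []
    if events_path.exists():
        for line in events_path.read_text(encoding="utf-8").splitlines():
            if line.strip():  # tolerate blank lines
                events.append(json.loads(line))

    return metadata, events
```

Calling `load_clip_sidecars(Path("data/he") / speaker_dir / clip_id)` returns the metadata dict and a list of event dicts (empty when no `.jsonl` exists), which a training pipeline could then filter by whatever fields the delivered schema actually defines.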