VHS Timecode Calibration
The VHS Timecode Calibration System enables precise audio/video synchronisation for Domesday Duplicator captures. It works by encoding frame numbers into both video (visual binary strip) and audio (FSK tones), then comparing them after VHS playback to measure the exact offset between audio and video.
Key Features:
- 62-second calibration cycles with machine-readable lead-in/lead-out structure
- 16-bit encoding optimised for VHS degradation
- Red/blue colour encoding (more robust than grayscale)
- 400/800 Hz FSK audio (optimised for VHS linear audio track)
- 1200 Hz pilot tone for frame synchronisation
- Multi-row visual encoding with majority voting
- Confidence-based decoding with explicit failure reporting
When capturing VHS with the Domesday Duplicator:
- Video (RF) and audio are captured by different hardware
- They start at different times with no synchronisation trigger
- VHS playback speed varies slightly from the nominal rate
- There's approximately 1 second of capture startup delay
This system solves the direction ambiguity problem. Without absolute frame numbers, you can detect that audio and video are offset, but not reliably in which direction. Encoding the same absolute frame number in both streams (visually in the video, as an FSK burst in the audio) makes the direction unambiguous: if the video shows frame 100 at the moment the audio encodes frame 102, the audio is definitively 2 frames ahead.
- Generate calibration video using the pattern generator
- Burn to DVD (or output via capture card)
- Record DVD playback to VHS tape
- Capture VHS with Domesday Duplicator + audio interface
- Process through vhs-decode (TBC + clock sync)
- Analyse timecodes to calculate offset
- Apply offset to future captures from that setup
Each cycle has 7 sections with distinct, machine-readable markers:
Section 1: LEADER (10s) - 0xFFFF pattern, VCR settling time
Section 2: COUNTDOWN (5s) - "11" prefix, frames until timecode
Section 3: SEPARATOR (1s) - 0x0000 pattern, transition marker
Section 4: TIMECODE (30s) - "10" prefix + frame number + "01" suffix
Section 5: SEPARATOR (1s) - 0x0000 pattern, transition marker
Section 6: COUNT-UP (5s) - "00" prefix, frames since timecode
Section 7: TAIL (10s) - 0xFFFF pattern, cycle complete
The decoder uses a state machine to track which section it's in - no guessing required.
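As a rough illustration of the layout above, the sketch below turns the section durations into frame ranges within one cycle. The names `SECTIONS`, `section_frame_ranges`, and `section_at` are hypothetical and not taken from the project, and an integer frame rate of 25 fps (PAL) is assumed.

```python
from typing import List, Tuple

# (name, duration in seconds) in playback order, copied from the section list.
SECTIONS: List[Tuple[str, int]] = [
    ("LEADER", 10),
    ("COUNTDOWN", 5),
    ("SEPARATOR", 1),
    ("TIMECODE", 30),
    ("SEPARATOR", 1),
    ("COUNT-UP", 5),
    ("TAIL", 10),
]


def section_frame_ranges(fps: int = 25) -> List[Tuple[str, int, int]]:
    """Return (name, first_frame, last_frame) for each section of one cycle."""
    ranges = []
    start = 0
    for name, seconds in SECTIONS:
        length = seconds * fps
        ranges.append((name, start, start + length - 1))
        start += length
    return ranges


def section_at(frame_in_cycle: int, fps: int = 25) -> str:
    """Look up the section containing a 0-based frame index within a cycle."""
    for name, first, last in section_frame_ranges(fps):
        if first <= frame_in_cycle <= last:
            return name
    raise ValueError("frame index is outside the 62-second cycle")


if __name__ == "__main__":
    for name, first, last in section_frame_ranges():
        print(f"{name:<10} frames {first:>4}-{last:>4}")
```

The durations add up to the 62 seconds quoted for one cycle, which is a quick sanity check on the section list.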
The top 60 pixels of each frame contain a machine-readable binary strip:
| Feature | V2 Specification |
|---|---|
| Strip height | 60 pixels (3 rows of 20 pixels) |
| Number of bits | 16 |
| Block width | ~40 pixels each |
| Bit '1' colour | Red (BGR: 0, 0, 255) |
| Bit '0' colour | Blue (BGR: 255, 0, 0) |
| Background | Mid-gray (128, 128, 128) |
Why red/blue instead of white/black?
VHS records video as separate luma (brightness) and chroma (colour) signals. White, gray, and black differ only in brightness, so a noise spike can easily shift one toward another. Red and blue sit at opposite ends of the colour spectrum, so a misread requires the colour itself to flip, not just a shift in brightness.
Why 3 rows?
VHS can cause horizontal streaking, dropouts, or noise that affects part of the strip. With 3 rows, the decoder uses majority voting - if 2 out of 3 rows agree, that's the bit value. Single-row corruption doesn't cause failure.
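The colour scheme can be sketched in a few lines. The code below maps a 16-bit value to one colour per block and classifies a sampled colour back into a bit. It is only an illustration: the function names are hypothetical, the first block is assumed to carry the most significant bit, and the confidence formula (normalised red/blue difference) is an assumption, not the project's actual metric.

```python
from typing import List, Tuple

BGR = Tuple[int, int, int]

RED: BGR = (0, 0, 255)   # bit '1', as in the table above
BLUE: BGR = (255, 0, 0)  # bit '0', as in the table above


def value_to_colours(value: int, bits: int = 16) -> List[BGR]:
    """One BGR colour per block; the first block is the MSB (assumed order)."""
    return [RED if (value >> (bits - 1 - i)) & 1 else BLUE for i in range(bits)]


def classify_sample(bgr: BGR) -> Tuple[int, float]:
    """Return (bit, confidence) for the mean colour of one block in one row.

    Confidence is the red/blue separation scaled to 0..1 (assumed formula).
    """
    blue, _green, red = bgr
    bit = 1 if red > blue else 0
    confidence = abs(red - blue) / 255.0
    return bit, confidence
```

A washed-out sample such as `(128, 128, 128)` yields zero confidence here, which is the behaviour one would want for the gray background.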
Frame numbers are encoded as Frequency Shift Keying tones:
| Feature | V2 Specification |
|---|---|
| Bit '0' frequency | 400 Hz |
| Bit '1' frequency | 800 Hz |
| Pilot tone | 1200 Hz (frame sync) |
| Sample rate | 48000 Hz |
| Bits per frame | 16 |
Frame structure:
- 10% pilot tone (1200 Hz) - for frame boundary detection
- 5% silence - separator
- 80% FSK data - the actual encoded bits
- 5% silence - separator
The pilot tone allows the decoder to find exact frame boundaries even when timing has drifted.
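To make the frame structure concrete, here is a minimal sketch that synthesises one audio frame with the 10% / 5% / 80% / 5% split described above. The function names are hypothetical, the amplitude of 0.8 and the MSB-first bit order are assumptions, and each tone simply restarts at phase zero, which a real generator might handle more carefully.

```python
import math
from typing import List

SAMPLE_RATE = 48000   # from the specification table
FREQ_ZERO = 400.0     # bit '0'
FREQ_ONE = 800.0      # bit '1'
FREQ_PILOT = 1200.0   # frame sync


def tone(freq: float, n_samples: int, amplitude: float = 0.8) -> List[float]:
    """Generate n_samples of a sine tone starting at phase zero."""
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(n_samples)]


def encode_audio_frame(value: int, fps: int = 25, bits: int = 16) -> List[float]:
    """Build one frame: pilot, silence, FSK data bits, trailing silence."""
    frame_len = SAMPLE_RATE // fps          # 1920 samples at 25 fps
    pilot_len = frame_len * 10 // 100
    gap_len = frame_len * 5 // 100
    data_len = frame_len * 80 // 100
    bit_len = data_len // bits

    samples = tone(FREQ_PILOT, pilot_len)
    samples += [0.0] * gap_len
    for i in range(bits):
        bit = (value >> (bits - 1 - i)) & 1  # MSB first (assumed)
        samples += tone(FREQ_ONE if bit else FREQ_ZERO, bit_len)
    # Pad with silence so the frame is exactly frame_len samples long.
    samples += [0.0] * (frame_len - len(samples))
    return samples
```

At 48000 Hz and 25 fps each bit gets 96 samples, so a 400 Hz bit contains less than one full cycle. That is worth keeping in mind when choosing a decoding method.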
Bits 0-1: Prefix marker (identifies frame type)
Bits 2-13: Data payload (depends on frame type)
Bits 14-15: Suffix marker (for timecode frames only)
Frame Types:
| Prefix | Suffix | Type | Data Content |
|---|---|---|---|
| 11 | - | Countdown | 4-bit countdown (5,4,3,2,1) + 10-bit frames until timecode |
| 10 | 01 | Timecode | 12-bit frame number (0-4095) |
| 00 | - | Lead-out | 4-bit count-up (1,2,3,4,5) + 10-bit frames since timecode |
| All 1s | - | Leader/Tail | 0xFFFF = preparation or cycle complete |
| All 0s | - | Separator | 0x0000 = section boundary |
12 bits cover 4096 frame numbers, about 164 seconds at 25 fps (PAL), which is more than enough for 30-second calibration windows.
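The bit layout and frame-type table can be expressed as a small encoder and classifier. This is a sketch under stated assumptions: bit 0 is taken to be the most significant bit of the 16-bit word, the function names are hypothetical, and for countdown and lead-out frames the 14 bits after the prefix are returned unsplit because the text does not say which field comes first.

```python
from typing import Optional, Tuple


def encode_timecode(frame_number: int) -> int:
    """Pack a frame number as '10' prefix + 12-bit payload + '01' suffix."""
    if not 0 <= frame_number <= 0x0FFF:
        raise ValueError("frame number must fit in 12 bits (0-4095)")
    return (0b10 << 14) | (frame_number << 2) | 0b01


def classify_frame(word: int) -> Tuple[str, Optional[int]]:
    """Return (frame type, payload) for a 16-bit word, following the table."""
    word &= 0xFFFF
    if word == 0xFFFF:
        return "LEADER_OR_TAIL", None
    if word == 0x0000:
        return "SEPARATOR", None
    prefix = word >> 14
    suffix = word & 0b11
    if prefix == 0b11:
        return "COUNTDOWN", word & 0x3FFF   # 4-bit + 10-bit fields, unsplit
    if prefix == 0b00:
        return "LEADOUT", word & 0x3FFF     # 4-bit + 10-bit fields, unsplit
    if prefix == 0b10 and suffix == 0b01:
        return "TIMECODE", (word >> 2) & 0x0FFF
    return "INVALID_MARKERS", None
```

Checking the all-ones and all-zeros words before the prefix matters, because `0xFFFF` would otherwise look like a countdown frame and `0x0000` like a lead-out frame.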
The calibration system is accessed via Menu 3 → Robust Timecode Calibration:
STEP 1 - PREPARATION:
1. Generate VHS Calibration Pattern (62s V2 Cycles)
2. Create DVD ISOs from MP4s
3. Burn DVD (built-in or external tool)
STEP 2 - RECORD:
4. Record DVD playback to VHS tape (at least 2 minutes)
STEP 3 - CALIBRATE:
5. Toggle Calibration Mode ON
6. Capture calibration VHS (uses fixed name "calibration")
7. Process through Workflow Control Centre: Decode → Export
8. Analyse calibration → calculates and saves offset
From the menu, select option 1 to generate calibration video:

    # Or manually:
    cd tools/timecode-generator
    python vhs_pattern_generator.py --cycles 2 --format PAL --output calibration.mp4

Options:
- `--cycles N` - Number of 62-second cycles (default: 2)
- `--format PAL|NTSC` - Video format
- `--output FILE` - Output MP4 path
Cycle count suggestions:
- 2 cycles = ~2 minutes (recommended for calibration)
- 10 cycles = ~10 minutes (testing/verification)
Important: For calibration, only Decode and Export steps are needed in the Workflow Control Centre.
1. Capture with Calibration Mode ON → calibration.lds, calibration.flac
2. Run vhs-decode → calibration.tbc
3. Run tbc-video-export → calibration_ffv1.mkv
4. Analyse calibration (compares video with RAW audio)
Why RAW audio? The calibration analysis uses calibration.flac (raw audio) rather than calibration_aligned.flac. This is because VhsDecodeAutoAudioAlign applies TBC timing corrections that assume audio is already roughly aligned - which it isn't during calibration. Using aligned audio would corrupt the offset measurement.
The system uses a sliding window search to find timecodes in both video and audio:
- Video scanning: Decodes visual timecodes from video frames in the first 62-second cycle
- Audio scanning: Uses 1/8 frame resolution sliding window to find FSK timecodes (VHS wow/flutter means FSK isn't aligned to exact frame boundaries)
- Matching: Finds timecodes that appear in both video and audio
- Outlier filtering: Removes matches more than 5 frames from the median offset (catches spurious decodes from non-timecode sections)
- Calculation: Computes median offset from remaining consistent matches
Example output:
Consistent timecodes analyzed: 110
Median offset: -15.34 frames
Std deviation: 0.578 frames
Offset in seconds: -0.6134s (-613.4ms)
A standard deviation under 1 frame indicates excellent consistency. Higher values may indicate VHS quality issues or timing drift.
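The matching, outlier filtering, and median steps can be sketched as follows. This is not the project's implementation: `compute_offset` is a hypothetical name, the inputs are assumed to be dictionaries mapping each decoded timecode to the position (in frames) where it was found, the sign convention (audio position minus video position) is an assumption, and the population standard deviation is used because the text does not say which variant the tool reports.

```python
from statistics import median, pstdev
from typing import Dict, Tuple


def compute_offset(video_tc: Dict[int, float],
                   audio_tc: Dict[int, float],
                   max_dev: float = 5.0) -> Tuple[float, float, int]:
    """Return (median offset in frames, std deviation, matches used)."""
    # Matching: keep only timecodes seen in both streams.
    offsets = [audio_tc[tc] - video_tc[tc] for tc in video_tc if tc in audio_tc]
    if not offsets:
        raise ValueError("no timecodes found in both streams")

    # Outlier filtering: drop matches more than max_dev frames from the median.
    centre = median(offsets)
    kept = [o for o in offsets if abs(o - centre) <= max_dev]
    if not kept:
        raise ValueError("no consistent matches after outlier filtering")

    return median(kept), pstdev(kept), len(kept)


def frames_to_seconds(frames: float, fps: float = 25.0) -> float:
    """Convert an offset in frames to seconds at the given frame rate."""
    return frames / fps
```

Using the median twice (once to find the centre, once on the filtered set) keeps a handful of spurious decodes from pulling the result, which a plain mean would not do.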
SEARCHING → IN_LEADER → IN_COUNTDOWN → READY_FOR_TIMECODE
                                                  ↓
CYCLE_COMPLETE ← IN_LEADOUT ← TIMECODE_COMPLETE ← READING_TIMECODE
The decoder tracks state based on the prefix bits it reads:
- `0xFFFF` (all ones) → Leader or Tail section
- `0x0000` (all zeros) → Separator section
- `"11"` prefix → Countdown section
- `"10"` prefix + `"01"` suffix → Timecode frame
- `"00"` prefix → Lead-out section
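A table-driven sketch of such a state machine is shown below. The state names come from the diagram above, but the exact transition rules are assumptions made for illustration, including the choice to stay in the current state on any unexpected frame and to re-enter `IN_COUNTDOWN` when a new cycle's countdown follows a completed one.

```python
from enum import Enum, auto
from typing import Dict, Iterable, List, Tuple


class State(Enum):
    SEARCHING = auto()
    IN_LEADER = auto()
    IN_COUNTDOWN = auto()
    READY_FOR_TIMECODE = auto()
    READING_TIMECODE = auto()
    TIMECODE_COMPLETE = auto()
    IN_LEADOUT = auto()
    CYCLE_COMPLETE = auto()


# (current state, frame kind) -> next state. Assumed rules, not from the source.
TRANSITIONS: Dict[Tuple[State, str], State] = {
    (State.SEARCHING, "LEADER_OR_TAIL"): State.IN_LEADER,
    (State.IN_LEADER, "COUNTDOWN"): State.IN_COUNTDOWN,
    (State.IN_COUNTDOWN, "SEPARATOR"): State.READY_FOR_TIMECODE,
    (State.READY_FOR_TIMECODE, "TIMECODE"): State.READING_TIMECODE,
    (State.READING_TIMECODE, "SEPARATOR"): State.TIMECODE_COMPLETE,
    (State.TIMECODE_COMPLETE, "LEADOUT"): State.IN_LEADOUT,
    (State.IN_LEADOUT, "LEADER_OR_TAIL"): State.CYCLE_COMPLETE,
    (State.CYCLE_COMPLETE, "COUNTDOWN"): State.IN_COUNTDOWN,
}


def step(state: State, kind: str) -> State:
    """Advance one frame; unknown combinations leave the state unchanged."""
    return TRANSITIONS.get((state, kind), state)


def run(kinds: Iterable[str]) -> List[State]:
    """Feed a sequence of frame kinds and return the state after each one."""
    state = State.SEARCHING
    trace = []
    for kind in kinds:
        state = step(state, kind)
        trace.append(state)
    return trace
```

Keeping the rules in a dictionary makes it easy to see that a stray timecode decoded during the leader cannot move the decoder into `READING_TIMECODE`.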
The calibration video is generated at 720x576 (PAL), but vhs-decode outputs TBC at 928x576. The decoder automatically detects the active content area:
TBC Output (928px):
┌────────────────────────────────────────────────────────┐
│ Black │ Active Content (720px) │ Black │
│ ~104px│ │ ~104px│
└────────────────────────────────────────────────────────┘
The decoder scans brightness in the top strip to find content boundaries, then decodes the binary strip relative to the active area.
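One simple way to find the content boundaries is to threshold a per-column brightness profile of the strip. The sketch below assumes that profile has already been computed (for example as the mean over the strip rows and colour channels); the function name and the threshold of 40 are hypothetical.

```python
from typing import Sequence, Tuple


def find_active_columns(column_brightness: Sequence[float],
                        threshold: float = 40.0) -> Tuple[int, int]:
    """Return (first, last) column index whose brightness exceeds threshold.

    column_brightness holds one value per pixel column of the top strip.
    The threshold is an assumed value separating black borders from content.
    """
    bright = [x for x, value in enumerate(column_brightness) if value > threshold]
    if not bright:
        raise ValueError("no active content found in the strip")
    return bright[0], bright[-1]
```

For a 928-pixel TBC line with 104-pixel black borders, this returns columns 104 and 823, so the active width is 720 pixels.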
The decoder doesn't average the entire block - it samples only the center to avoid edge contamination from VHS horizontal shift:
Block (40px wide):
┌──────────────────────────────────────┐
│ Skip │ Sample Center │ Skip │
│ 25% │ 50% │ 25% │
└──────────────────────────────────────┘
This avoids:
- Left edge transition zones (blurry after VHS)
- Right edge contamination from adjacent blocks
- Top/bottom artifacts from head switching
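The centre-sampling rule reduces to a little index arithmetic. The following sketch computes the sampled column range for one block and averages a single row over it; the function names are hypothetical, and only the horizontal 25% / 50% / 25% split from the diagram is modelled.

```python
from typing import Sequence, Tuple


def centre_window(block_start: int, block_width: int,
                  skip_fraction: float = 0.25) -> Tuple[int, int]:
    """Return the half-open column range [start, end) of the sampled centre."""
    skip = int(block_width * skip_fraction)
    return block_start + skip, block_start + block_width - skip


def mean_centre(row: Sequence[float], block_start: int, block_width: int) -> float:
    """Average one row of channel values over the centre 50% of a block."""
    start, end = centre_window(block_start, block_width)
    window = row[start:end]
    if not window:
        raise ValueError("block is too narrow to sample")
    return sum(window) / len(window)
```

For a 40-pixel block starting at column 0, the sampled range is columns 10 to 29, so blurred transitions in the outer 10 pixels on each side never enter the average.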
For each of the 16 bits, the decoder:
1. Samples all 3 rows independently
   - Compares red vs blue channel intensity
   - Calculates per-row confidence from colour separation
2. Majority vote determines bit value
   - 2+ rows say '1' → bit is '1'
   - Otherwise → bit is '0'
3. Confidence calculation with disagreement penalty

       from statistics import mean

       # row_results is a list of (bit, confidence) pairs, one per row
       # Base confidence = average of the rows that agree with the winning bit
       agreeing_confidences = [c for bit, c in row_results if bit == winning_bit]
       dissenting_confidences = [c for bit, c in row_results if bit != winning_bit]
       base_confidence = mean(agreeing_confidences)

       # Penalise in proportion to the dissenting row's confidence
       if dissenting_confidences:
           dissent_penalty = mean(dissenting_confidences) * 0.5
           bit_confidence = base_confidence * (1 - dissent_penalty)
       else:
           bit_confidence = base_confidence

Example impact:
| Row 0 | Row 1 | Row 2 | Result | Confidence |
|---|---|---|---|---|
| 1 (0.8) | 1 (0.8) | 1 (0.8) | 1 | 0.80 (no penalty) |
| 1 (0.8) | 1 (0.8) | 0 (0.1) | 1 | 0.76 (5% penalty) |
| 1 (0.8) | 1 (0.8) | 0 (0.8) | 1 | 0.48 (40% penalty) |
This ensures high-confidence disagreement properly triggers low-confidence warnings, rather than being silently ignored.
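Wrapping the vote and the penalty into one function makes it easy to reproduce the example table. `vote_bit` is a hypothetical name, and the sketch assumes exactly three `(bit, confidence)` pairs per call.

```python
from statistics import mean
from typing import List, Tuple


def vote_bit(row_results: List[Tuple[int, float]]) -> Tuple[int, float]:
    """Majority-vote one bit from three rows and apply the dissent penalty."""
    ones = sum(1 for bit, _ in row_results if bit == 1)
    winning_bit = 1 if ones >= 2 else 0

    agreeing = [c for bit, c in row_results if bit == winning_bit]
    dissenting = [c for bit, c in row_results if bit != winning_bit]

    base_confidence = mean(agreeing)
    if dissenting:
        # A confident dissenter costs more than an unsure one.
        penalty = mean(dissenting) * 0.5
        return winning_bit, base_confidence * (1 - penalty)
    return winning_bit, base_confidence


if __name__ == "__main__":
    for rows in ([(1, 0.8), (1, 0.8), (1, 0.8)],
                 [(1, 0.8), (1, 0.8), (0, 0.1)],
                 [(1, 0.8), (1, 0.8), (0, 0.8)]):
        bit, confidence = vote_bit(rows)
        print(rows, "->", bit, round(confidence, 2))
```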
The decoder never silently falls back to degraded algorithms. Instead, it reports explicit failure reasons:
| Status | Meaning |
|---|---|
| `OK` | Frame decoded successfully |
| `LOW_CONFIDENCE` | Overall confidence below 15% threshold |
| `TOO_MANY_UNCERTAIN_BITS:N` | More than 2 bits with low confidence |
| `INVALID_MARKERS` | Prefix/suffix bits don't match any valid frame type |
| `INVALID_FRAME` | Frame couldn't be read at all |
This makes debugging much easier - you know exactly why a frame failed to decode.
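The status codes could be produced by a check along these lines. The 15% overall threshold and the limit of 2 uncertain bits come from the table; everything else is an assumption made for the sketch, including the per-bit cut-off of 0.30, the order of the checks, the MSB-first bit order, and the reading of `N` as the number of uncertain bits.

```python
from typing import List, Tuple

OVERALL_THRESHOLD = 0.15     # from the table above
MAX_UNCERTAIN_BITS = 2       # from the table above
BIT_UNCERTAIN_BELOW = 0.30   # assumed: the per-bit cut-off is not documented


def markers_valid(word: int) -> bool:
    """True if the word matches one of the documented frame types."""
    if word in (0xFFFF, 0x0000):
        return True
    prefix, suffix = word >> 14, word & 0b11
    return prefix in (0b11, 0b00) or (prefix == 0b10 and suffix == 0b01)


def frame_status(bits: List[Tuple[int, float]]) -> str:
    """Return a status string for 16 (bit, confidence) pairs, MSB first."""
    if len(bits) != 16:
        return "INVALID_FRAME"
    overall = sum(conf for _, conf in bits) / len(bits)
    if overall < OVERALL_THRESHOLD:
        return "LOW_CONFIDENCE"
    uncertain = sum(1 for _, conf in bits if conf < BIT_UNCERTAIN_BELOW)
    if uncertain > MAX_UNCERTAIN_BITS:
        return f"TOO_MANY_UNCERTAIN_BITS:{uncertain}"
    word = 0
    for bit, _ in bits:
        word = (word << 1) | bit
    if not markers_valid(word):
        return "INVALID_MARKERS"
    return "OK"
```

Returning a reason string instead of a boolean is what lets a caller log why each frame was rejected.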
The calibration system uses Zero-Crossing Rate (ZCR) analysis as the primary audio decoding method:
Why ZCR?
The V2 encoding uses 400 Hz/800 Hz FSK with 16 bits per frame. At a 78125 Hz sample rate (Rene Wolf Sound Card), each bit spans only ~156 samples, or about 2 ms. That is too short for reliable FFT-based frequency detection: a 156-sample FFT has bins roughly 500 Hz wide, coarser than the 400 Hz gap between the two tones. ZCR still works with partial cycles:

    # ZCR frequency estimation (each full cycle produces two zero crossings)
    crossings = count_zero_crossings(bit_audio)
    estimated_freq = crossings / len(bit_audio) * sample_rate / 2
    bit = '0' if estimated_freq < 600 else '1'  # 600 Hz threshold

Frame structure handling:
- First 15% skipped (pilot tone + silence)
- Middle 80% decoded as FSK data
- Last 5% skipped (trailing silence)
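A self-contained sketch of ZCR decoding with this frame structure is given below. `count_zero_crossings` follows the name used in the text, but its body here is an assumption (a plain sign-change count), as are `estimate_frequency`, `decode_bits`, and the MSB-first ordering of the returned bit string.

```python
from typing import Sequence


def count_zero_crossings(samples: Sequence[float]) -> int:
    """Count sign changes between consecutive samples."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            crossings += 1
    return crossings


def estimate_frequency(samples: Sequence[float], sample_rate: int) -> float:
    """Estimate frequency from crossings; two crossings make one cycle."""
    return count_zero_crossings(samples) / len(samples) * sample_rate / 2


def decode_bits(frame: Sequence[float], sample_rate: int,
                bits: int = 16, threshold: float = 600.0) -> str:
    """Decode one frame: skip the first 15%, read 80% as FSK, ignore the rest."""
    n = len(frame)
    start = n * 15 // 100            # pilot tone + silence
    data = frame[start:start + n * 80 // 100]
    bit_len = len(data) // bits
    decoded = []
    for i in range(bits):
        chunk = data[i * bit_len:(i + 1) * bit_len]
        freq = estimate_frequency(chunk, sample_rate)
        decoded.append('0' if freq < threshold else '1')
    return ''.join(decoded)
```

With only a fraction of a cycle per bit, the crossing count for a real capture depends on the phase at which each bit starts, so estimates are coarse; the 600 Hz threshold only needs to separate "few crossings" from "about twice as many".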
Sliding window search:
Because VHS wow/flutter shifts timing, FSK timecodes don't align to exact frame boundaries. The decoder uses 1/8 frame steps to find where valid timecodes actually occur:
Frame boundary search with 1/8 frame resolution:
Position 293.904: TC=44 VALID
Position 294.902: TC=45 VALID (delta ~1.0 frame)
Position 295.901: TC=46 VALID (delta ~1.0 frame)
Consecutive timecodes with ~1.0 frame spacing confirm successful decoding.
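The search loop itself can be sketched independently of the audio decoding. In the code below, `decode_at` is a hypothetical callback that attempts a decode at a fractional frame position and returns a timecode or `None`; the function names, the first-hit-wins rule for repeated timecodes, and the spacing tolerance of 0.25 frames are all assumptions.

```python
from typing import Callable, List, Optional, Tuple

Hit = Tuple[float, int]  # (position in frames, decoded timecode)


def sliding_search(decode_at: Callable[[float], Optional[int]],
                   start_frame: float, end_frame: float,
                   steps_per_frame: int = 8) -> List[Hit]:
    """Try a decode every 1/steps_per_frame frame; keep the first hit per timecode."""
    hits: List[Hit] = []
    seen = set()
    total_steps = int((end_frame - start_frame) * steps_per_frame)
    for i in range(total_steps + 1):
        position = start_frame + i / steps_per_frame
        timecode = decode_at(position)
        if timecode is not None and timecode not in seen:
            seen.add(timecode)
            hits.append((position, timecode))
    return hits


def spacing_is_consistent(hits: List[Hit], tolerance: float = 0.25) -> bool:
    """Check that successive hits are consecutive timecodes about 1 frame apart."""
    for (pos_a, tc_a), (pos_b, tc_b) in zip(hits, hits[1:]):
        if tc_b != tc_a + 1 or abs((pos_b - pos_a) - 1.0) > tolerance:
            return False
    return True
```

Passing the decoder in as a callback keeps the search logic testable with a stub, without needing real captured audio.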
The corner markers (red at top-left/bottom-right, blue at top-right/bottom-left) enable geometric correction:

    # Detect corners → compute perspective transform → correct frame
    corrected_frame = cv2.warpPerspective(frame, transform_matrix, (width, height))

This corrects for rotation, scaling, and skew from VHS playback or capture.
Generator files:
| File | Purpose |
|---|---|
| `tools/timecode-generator/vhs_pattern_generator.py` | Generate calibration MP4 with cycles |
| `tools/timecode-generator/vhs_timecode_generator.py` | Single-frame generation |
| `tools/timecode-generator/shared_timecode_robust.py` | Core encoding/decoding |
| `docs/v2-timecode-implementation-plan.md` | Implementation plan and technical details |
Calibration workflow files (in temp/):
| File | Purpose |
|---|---|
| `calibration.lds` | RF capture from DomesdayDuplicator |
| `calibration.flac` | Raw audio capture (used for analysis) |
| `calibration.tbc` | TBC output from vhs-decode |
| `calibration.tbc.json` | TBC metadata |
| `calibration_ffv1.mkv` | Exported video (used for analysis) |
Note: calibration_aligned.flac and calibration_final.mkv are NOT used for calibration analysis - they have TBC timing corrections applied that would corrupt the offset measurement.
┌─────────────────┐
│ Bit '0' │ Guard ┌─────────────────┐
│ 400 Hz │ Band │ Bit '1' │
│ (300-500 Hz) │ (500-650) │ 800 Hz │
└─────────────────┘ │ (650-950 Hz) │
└─────────────────┘
Guard ┌─────────────────┐
Band │ Pilot │
(950-1050) │ 1200 Hz │
│ (1050-1350 Hz) │
└─────────────────┘
The frequencies were chosen to:
- Stay well within VHS linear audio passband (100 Hz - 10 kHz)
- Maintain 2:1 ratio for clear differentiation
- Have sufficient guard bands to prevent overlap after VHS frequency shift
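The band plan in the diagram can be read as a lookup from a measured frequency to a symbol. The sketch below encodes those band edges directly; the function name and the treatment of the exact edge values are assumptions.

```python
def classify_frequency(freq_hz: float) -> str:
    """Map a measured frequency to a band from the diagram above."""
    if 300 <= freq_hz < 500:
        return "BIT_0"      # nominal 400 Hz
    if 650 <= freq_hz < 950:
        return "BIT_1"      # nominal 800 Hz
    if 1050 <= freq_hz <= 1350:
        return "PILOT"      # nominal 1200 Hz
    return "GUARD_OR_OUT_OF_BAND"
```

With these edges, the 400 Hz tone can drift up by 25% (to 500 Hz) before it leaves its band, which is far more than typical tape speed variation.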