VHS Timecode Calibration

VHS Timecode Calibration System (V2)

Summary

The VHS Timecode Calibration System enables precise audio/video synchronisation for Domesday Duplicator captures. It works by encoding frame numbers into both video (visual binary strip) and audio (FSK tones), then comparing them after VHS playback to measure the exact offset between audio and video.

Key Features:

62-second calibration cycles with machine-readable lead-in/lead-out structure
16-bit encoding optimised for VHS degradation
Red/blue colour encoding (more robust than grayscale)
400/800 Hz FSK audio (optimised for VHS linear audio track)
1200 Hz pilot tone for frame synchronisation
Multi-row visual encoding with majority voting
Confidence-based decoding with explicit failure reporting

Why This Exists

When capturing VHS with the Domesday Duplicator:

Video (RF) and audio are captured by different hardware
They start at different times with no synchronisation trigger
VHS playback speed varies slightly from the nominal rate
There's approximately 1 second of capture startup delay

This system solves the direction ambiguity problem. Without absolute frame numbers, you can detect that audio and video are offset, but not reliably in which direction. Because the same absolute frame number is encoded in both streams (visually on the video, as an FSK burst in the audio), the offset direction is unambiguous: if the video shows frame 100 at the moment the audio encodes frame 102, the audio is 2 frames ahead.

How It Works

The Calibration Workflow

Generate calibration video using the pattern generator
Burn to DVD (or output via capture card)
Record DVD playback to VHS tape
Capture VHS with Domesday Duplicator + audio interface
Process through vhs-decode (TBC + clock sync)
Analyse timecodes to calculate offset
Apply offset to future captures from that setup

The 62-Second Calibration Cycle

Each cycle has 7 sections with distinct, machine-readable markers:

Section 1: LEADER      (10s) - 0xFFFF pattern, VCR settling time
Section 2: COUNTDOWN   (5s)  - "11" prefix, frames until timecode
Section 3: SEPARATOR   (1s)  - 0x0000 pattern, transition marker
Section 4: TIMECODE    (30s) - "10" prefix + frame number + "01" suffix
Section 5: SEPARATOR   (1s)  - 0x0000 pattern, transition marker
Section 6: COUNT-UP    (5s)  - "00" prefix, frames since timecode
Section 7: TAIL        (10s) - 0xFFFF pattern, cycle complete

The decoder uses a state machine to track which section it's in - no guessing required.
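
As a rough illustration of the layout above, the following sketch converts the seven section durations into frame ranges at 25 fps (PAL) and looks up which section a given frame falls in. The names `SECTIONS` and `section_at` are hypothetical and are not taken from the toolkit.

```python
# Hypothetical sketch: section names and durations copied from the list above.
SECTIONS = [
    ("LEADER", 10),
    ("COUNTDOWN", 5),
    ("SEPARATOR", 1),
    ("TIMECODE", 30),
    ("SEPARATOR", 1),
    ("COUNT-UP", 5),
    ("TAIL", 10),
]

CYCLE_SECONDS = sum(seconds for _, seconds in SECTIONS)


def section_at(frame_index: int, fps: int = 25) -> str:
    """Return the name of the section that a frame index falls in."""
    frame_in_cycle = frame_index % (CYCLE_SECONDS * fps)
    start = 0
    for name, seconds in SECTIONS:
        end = start + seconds * fps
        if frame_in_cycle < end:
            return name
        start = end
    raise AssertionError("unreachable: the sections cover the whole cycle")


print(CYCLE_SECONDS)        # 62
print(section_at(0))        # LEADER
print(section_at(16 * 25))  # TIMECODE (starts 16 s into the cycle)
```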

Visual Encoding (Binary Strip)

The top 60 pixels of each frame contain a machine-readable binary strip:

Feature	V2 Specification
Strip height	60 pixels (3 rows of 20 pixels)
Number of bits	16
Block width	~40 pixels each
Bit '1' colour	Red (BGR: 0, 0, 255)
Bit '0' colour	Blue (BGR: 255, 0, 0)
Background	Mid-gray (128, 128, 128)

Why red/blue instead of white/black?

VHS records video as separate luma (brightness) and chroma (colour) signals. White and black differ only in brightness, so a noise spike in the luma signal can easily push one toward the other. Red and blue differ in colour rather than in brightness alone, so a misread requires the colour itself to flip, not just a brightness shift.

Why 3 rows?

VHS can cause horizontal streaking, dropouts, or noise that affects part of the strip. With 3 rows, the decoder uses majority voting - if 2 out of 3 rows agree, that's the bit value. Single-row corruption doesn't cause failure.
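
To make the strip layout concrete, here is a minimal pure-Python sketch that builds the 60-pixel strip for one 16-bit value as rows of BGR tuples. The colours, row count, and block width come from the table above. The 720-pixel frame width, the centring of the blocks with gray padding, and the most-significant-bit-first order are assumptions made for illustration, as are the function names.

```python
RED_BGR = (0, 0, 255)        # bit '1'
BLUE_BGR = (255, 0, 0)       # bit '0'
GRAY_BGR = (128, 128, 128)   # background


def render_strip_row(value: int, width: int = 720, block_width: int = 40,
                     bits: int = 16) -> list:
    """Return one pixel row (list of BGR tuples) encoding `value`.

    Assumptions, not toolkit behaviour: bits are drawn most significant
    first, and the 16 blocks are centred with gray padding on either side.
    """
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the strip")
    margin = (width - bits * block_width) // 2
    row = [GRAY_BGR] * width
    for i in range(bits):
        bit = (value >> (bits - 1 - i)) & 1
        colour = RED_BGR if bit else BLUE_BGR
        start = margin + i * block_width
        row[start:start + block_width] = [colour] * block_width
    return row


def render_strip(value: int, rows: int = 3, row_height: int = 20) -> list:
    """Stack 3 identical 20-pixel rows to form the 60-pixel strip."""
    row = render_strip_row(value)
    return [list(row) for _ in range(rows * row_height)]


strip = render_strip(0x8001)
print(len(strip), len(strip[0]))  # 60 720
```

Because the three rows carry identical data, a decoder can read each row separately and vote, which is what makes single-row corruption survivable.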

Audio Encoding (FSK)

Frame numbers are encoded as Frequency Shift Keying tones:

Feature	V2 Specification
Bit '0' frequency	400 Hz
Bit '1' frequency	800 Hz
Pilot tone	1200 Hz (frame sync)
Sample rate	48000 Hz
Bits per frame	16

Frame structure:

10% pilot tone (1200 Hz) - for frame boundary detection
5% silence - separator
80% FSK data - the actual encoded bits
5% silence - separator

The pilot tone allows the decoder to find exact frame boundaries even when timing has drifted.
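
The frame structure above can be sketched as a small generator that produces one video frame's worth of audio samples. The frequencies, sample rate, and 10/5/80/5 split come from this section. The unit amplitude, the phase reset at the start of each bit, the integer frame rate, and the most-significant-bit-first order are assumptions of this sketch, and `encode_audio_frame` is a hypothetical name.

```python
import math

SAMPLE_RATE = 48000
FREQ_ZERO, FREQ_ONE, FREQ_PILOT = 400.0, 800.0, 1200.0


def tone(freq: float, n_samples: int) -> list:
    """Sine tone starting at phase zero with amplitude 1.0."""
    return [math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(n_samples)]


def encode_audio_frame(value: int, fps: int = 25, bits: int = 16) -> list:
    """Return the audio samples for one frame encoding `value`."""
    total = SAMPLE_RATE // fps      # 1920 samples at 25 fps
    n_pilot = total * 10 // 100     # 10% pilot tone
    n_gap = total * 5 // 100        # 5% silence
    n_data = total * 80 // 100      # 80% FSK data
    per_bit = n_data // bits

    samples = tone(FREQ_PILOT, n_pilot) + [0.0] * n_gap
    for i in range(bits):
        bit = (value >> (bits - 1 - i)) & 1   # MSB first (assumption)
        samples += tone(FREQ_ONE if bit else FREQ_ZERO, per_bit)
    # Trailing silence pads the frame to its full length.
    samples += [0.0] * (total - len(samples))
    return samples


frame = encode_audio_frame(0x8191)
print(len(frame))  # 1920
```

At 25 fps each bit lasts 96 samples (2 ms), which is less than one full cycle of the 400 Hz tone.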

16-Bit Frame Encoding

Bits 0-1:   Prefix marker (identifies frame type)
Bits 2-13:  Data payload (depends on frame type)
Bits 14-15: Suffix marker (timecode frames only; countdown and lead-out frames carry 4 + 10 = 14 data bits, so their payload extends into these bits)

Frame Types:

Prefix	Suffix	Type	Data Content
`11`	-	Countdown	4-bit countdown (5,4,3,2,1) + 10-bit frames until timecode
`10`	`01`	Timecode	12-bit frame number (0-4095)
`00`	-	Lead-out	4-bit count-up (1,2,3,4,5) + 10-bit frames since timecode
All 1s	-	Leader/Tail	0xFFFF = preparation or cycle complete
All 0s	-	Separator	0x0000 = section boundary

12 bits support 4096 frames, which is about 164 seconds at 25 fps (PAL). That is more than enough for the 30-second timecode section.
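
A minimal sketch of the bit layout follows: it packs and unpacks a timecode word and classifies any 16-bit word by its markers. It assumes that "bit 0" is the most significant (first transmitted) bit. The function names and the returned labels are hypothetical, apart from `INVALID_MARKERS`, which mirrors the decoder status of the same name.

```python
from typing import Optional

PREFIX_TIMECODE = 0b10
SUFFIX_TIMECODE = 0b01


def pack_timecode(frame_number: int) -> int:
    """Build a timecode word: "10" prefix, 12-bit frame number, "01" suffix."""
    if not 0 <= frame_number <= 0xFFF:
        raise ValueError("frame number must fit in 12 bits (0-4095)")
    return (PREFIX_TIMECODE << 14) | (frame_number << 2) | SUFFIX_TIMECODE


def unpack_timecode(word: int) -> Optional[int]:
    """Return the frame number, or None if the markers do not match."""
    if (word >> 14) != PREFIX_TIMECODE or (word & 0b11) != SUFFIX_TIMECODE:
        return None
    return (word >> 2) & 0xFFF


def classify_word(word: int) -> str:
    """Map a 16-bit word to a frame type using the table above."""
    if word == 0xFFFF:
        return "LEADER_OR_TAIL"
    if word == 0x0000:
        return "SEPARATOR"
    prefix, suffix = word >> 14, word & 0b11
    if prefix == 0b11:
        return "COUNTDOWN"
    if prefix == 0b10 and suffix == 0b01:
        return "TIMECODE"
    if prefix == 0b00:
        return "LEAD_OUT"
    return "INVALID_MARKERS"


word = pack_timecode(100)
print(format(word, "016b"))   # 1000000110010001
print(unpack_timecode(word))  # 100
print(classify_word(word))    # TIMECODE
```

The all-ones and all-zeros checks run first so that the leader and separator patterns are not mistaken for countdown or lead-out frames.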

Using the Calibration System

From the Main Menu

The calibration system is accessed via Menu 3 → Robust Timecode Calibration:

STEP 1 - PREPARATION:
  1. Generate VHS Calibration Pattern (62s V2 Cycles)
  2. Create DVD ISOs from MP4s
  3. Burn DVD (built-in or external tool)

STEP 2 - RECORD:
  4. Record DVD playback to VHS tape (at least 2 minutes)

STEP 3 - CALIBRATE:
  5. Toggle Calibration Mode ON
  6. Capture calibration VHS (uses fixed name "calibration")
  7. Process through Workflow Control Centre: Decode → Export
  8. Analyse calibration → calculates and saves offset

Generating Calibration Video

From the menu, select option 1 to generate calibration video:

# Or manually:
cd tools/timecode-generator
python vhs_pattern_generator.py --cycles 2 --format PAL --output calibration.mp4

Options:

--cycles N - Number of 62-second cycles (default: 2)
--format PAL|NTSC - Video format
--output FILE - Output MP4 path

Cycle count suggestions:

2 cycles = ~2 minutes (recommended for calibration)
10 cycles = ~10 minutes (testing/verification)

Processing Captured VHS

Important: For calibration, only Decode and Export steps are needed in the Workflow Control Centre.

1. Capture with Calibration Mode ON → calibration.lds, calibration.flac
2. Run vhs-decode → calibration.tbc
3. Run tbc-video-export → calibration_ffv1.mkv
4. Analyse calibration (compares video with RAW audio)

Why RAW audio? The calibration analysis uses calibration.flac (raw audio) rather than calibration_aligned.flac. This is because VhsDecodeAutoAudioAlign applies TBC timing corrections that assume audio is already roughly aligned - which it isn't during calibration. Using aligned audio would corrupt the offset measurement.

Multi-Point Offset Calculation

The system uses a sliding window search to find timecodes in both video and audio:

Video scanning: Decodes visual timecodes from video frames in the first 62-second cycle
Audio scanning: Uses 1/8 frame resolution sliding window to find FSK timecodes (VHS wow/flutter means FSK isn't aligned to exact frame boundaries)
Matching: Finds timecodes that appear in both video and audio
Outlier filtering: Removes matches more than 5 frames from the median offset (catches spurious decodes from non-timecode sections)
Calculation: Computes median offset from remaining consistent matches

Example output:

Consistent timecodes analyzed: 110
Median offset: -15.34 frames
Std deviation: 0.578 frames
Offset in seconds: -0.6134s (-613.4ms)

A standard deviation under 1 frame indicates excellent consistency. Higher values may indicate VHS quality issues or timing drift.
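
The matching, outlier filtering, and median steps described in this section can be sketched as follows. This is an illustration rather than the toolkit's code: `estimate_offset` is a hypothetical name, the inputs are assumed to be dictionaries mapping each decoded timecode to the position (in frames) where it was found, and the sign convention (audio position minus video position) and the use of the population standard deviation are assumptions.

```python
from statistics import median, pstdev


def estimate_offset(video_positions: dict, audio_positions: dict,
                    max_deviation: float = 5.0):
    """Return (median_offset, std_dev, matches_used), or None if no match.

    Both arguments map timecode -> position in frames where it was decoded.
    """
    # Matching: keep only timecodes found in both streams.
    common = sorted(set(video_positions) & set(audio_positions))
    offsets = [audio_positions[tc] - video_positions[tc] for tc in common]
    if not offsets:
        return None

    # Outlier filtering: drop matches more than 5 frames from the median.
    rough = median(offsets)
    kept = [o for o in offsets if abs(o - rough) <= max_deviation]
    if not kept:
        return None

    # Calculation: median of the remaining consistent matches.
    return median(kept), pstdev(kept), len(kept)


video = {tc: 100.0 + tc for tc in range(10)}
audio = {tc: 100.0 + tc - 15.25 for tc in range(10)}
audio[3] = 500.0  # a spurious decode that the filter should reject
print(estimate_offset(video, audio))  # (-15.25, 0.0, 9)
```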

Technical Details

Decoder State Machine

SEARCHING → IN_LEADER → IN_COUNTDOWN → READY_FOR_TIMECODE
                                              ↓
    CYCLE_COMPLETE ← IN_LEADOUT ← TIMECODE_COMPLETE ← READING_TIMECODE
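
The diagram above can be sketched as a lookup table of transitions. Only the state names come from the diagram. Which frame type triggers each transition is an assumption inferred from the section order of the cycle, and the frame-kind labels (`LEADER_OR_TAIL`, `COUNTDOWN`, and so on) are hypothetical.

```python
# (current state, frame kind) -> next state.
# The triggers are assumptions; the state names are from the diagram.
TRANSITIONS = {
    ("SEARCHING", "LEADER_OR_TAIL"): "IN_LEADER",
    ("IN_LEADER", "COUNTDOWN"): "IN_COUNTDOWN",
    ("IN_COUNTDOWN", "SEPARATOR"): "READY_FOR_TIMECODE",
    ("READY_FOR_TIMECODE", "TIMECODE"): "READING_TIMECODE",
    ("READING_TIMECODE", "SEPARATOR"): "TIMECODE_COMPLETE",
    ("TIMECODE_COMPLETE", "LEAD_OUT"): "IN_LEADOUT",
    ("IN_LEADOUT", "LEADER_OR_TAIL"): "CYCLE_COMPLETE",
}


def step(state: str, frame_kind: str) -> str:
    """Advance on a recognised transition; otherwise stay in the same state."""
    return TRANSITIONS.get((state, frame_kind), state)


def run(frame_kinds, state: str = "SEARCHING") -> str:
    """Feed a sequence of frame kinds through the state machine."""
    for kind in frame_kinds:
        state = step(state, kind)
    return state


cycle = (["LEADER_OR_TAIL"] * 3 + ["COUNTDOWN"] * 2 + ["SEPARATOR"]
         + ["TIMECODE"] * 4 + ["SEPARATOR"] + ["LEAD_OUT"] * 2
         + ["LEADER_OR_TAIL"])
print(run(cycle))  # CYCLE_COMPLETE
```

Staying in the current state on an unexpected frame kind is one possible policy; a stricter decoder might return to `SEARCHING` instead.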

The decoder tracks state based on the prefix bits it reads:

0xFFFF (all ones) → Leader or Tail section
0x0000 (all zeros) → Separator section
"11" prefix → Countdown section
"10" prefix + "01" suffix → Timecode frame
"00" prefix → Lead-out section

Visual Decoding: TBC Resolution Handling

The calibration video is generated at 720x576 (PAL), but vhs-decode outputs TBC at 928x576. The decoder automatically detects the active content area:

TBC Output (928px):
┌────────────────────────────────────────────────────────┐
│ Black │        Active Content (720px)        │ Black │
│ ~104px│                                      │ ~104px│
└────────────────────────────────────────────────────────┘

The decoder scans brightness in the top strip to find content boundaries, then decodes the binary strip relative to the active area.
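
A minimal sketch of the boundary scan, assuming a single pixel row of BGR tuples taken from the top strip. It uses the largest channel value as the brightness measure so that saturated blue blocks are not mistaken for black borders. That choice, the threshold value, and the name `find_active_area` are assumptions of this sketch, not the decoder's actual logic.

```python
def find_active_area(row: list, threshold: int = 48):
    """Return (left, right) column bounds of the non-black content.

    `row` is one pixel row of (B, G, R) tuples. The right bound is
    exclusive. Returns None if the whole row is below the threshold.
    """
    bright = [x for x, pixel in enumerate(row) if max(pixel) > threshold]
    if not bright:
        return None
    return bright[0], bright[-1] + 1


# 928-pixel TBC row: black borders around 720 pixels of content.
black, gray = (0, 0, 0), (128, 128, 128)
tbc_row = ([black] * 104 + [(255, 0, 0)] * 40 + [gray] * 640
           + [(0, 0, 255)] * 40 + [black] * 104)
print(find_active_area(tbc_row))  # (104, 824)
```

Block positions can then be computed relative to the returned left bound rather than to column 0 of the TBC output.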

Visual Decoding: Center Sampling

The decoder doesn't average the entire block - it samples only the center to avoid edge contamination from VHS horizontal shift:

Block (40px wide):
┌──────────────────────────────────────┐
│ Skip │    Sample Center    │ Skip │
│ 25%  │       50%           │ 25%  │
└──────────────────────────────────────┘

This avoids:

Left edge transition zones (blurry after VHS)
Right edge contamination from adjacent blocks
Top/bottom artifacts from head switching
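
The horizontal part of centre sampling can be sketched as below, again using a single row of BGR tuples. The confidence measure (red/blue separation scaled to 0-1) and the name `sample_block` are assumptions for illustration. Vertical cropping within each 20-pixel row is not shown.

```python
def sample_block(row: list, block_start: int, block_width: int = 40):
    """Decode one bit using only the centre 50% of a block.

    Returns (bit, confidence). Confidence is the absolute difference
    between the mean red and mean blue values, scaled to 0-1.
    """
    skip = block_width // 4   # skip 25% on each side
    centre = row[block_start + skip:block_start + block_width - skip]
    mean_blue = sum(pixel[0] for pixel in centre) / len(centre)
    mean_red = sum(pixel[2] for pixel in centre) / len(centre)
    bit = 1 if mean_red > mean_blue else 0
    confidence = abs(mean_red - mean_blue) / 255.0
    return bit, confidence


# A red block whose outer quarters are contaminated by blue neighbours.
red, blue = (0, 0, 255), (255, 0, 0)
smeared = [blue] * 10 + [red] * 20 + [blue] * 10
print(sample_block(smeared, 0))  # (1, 1.0)
```

Averaging the full 40 pixels of the smeared block above would give equal red and blue means and zero confidence, which is the failure that centre sampling is meant to avoid.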

Three-Row Voting with Confidence Scoring

For each of the 16 bits, the decoder:

Samples all 3 rows independently
- Compares red vs blue channel intensity
- Calculates per-row confidence from colour separation
Majority vote determines bit value
- 2+ rows say '1' → bit is '1'
- Otherwise → bit is '0'
Confidence calculation with disagreement penalty

from statistics import mean

def bit_confidence(row_results, winning_bit):
    """row_results holds one (bit, confidence) pair per row."""
    # Base confidence = average of agreeing rows
    agreeing = [c for bit, c in row_results if bit == winning_bit]
    dissenting = [c for bit, c in row_results if bit != winning_bit]
    base_confidence = mean(agreeing)

    # Penalise based on the dissenting rows' confidence
    if dissenting:
        dissent_penalty = mean(dissenting) * 0.5
        return base_confidence * (1 - dissent_penalty)
    return base_confidence

Example impact:

Row 0	Row 1	Row 2	Result	Confidence
1 (0.8)	1 (0.8)	1 (0.8)	1	0.80 (no penalty)
1 (0.8)	1 (0.8)	0 (0.1)	1	0.76 (5% penalty)
1 (0.8)	1 (0.8)	0 (0.8)	1	0.48 (40% penalty)

This ensures high-confidence disagreement properly triggers low-confidence warnings, rather than being silently ignored.

Explicit Failure Modes

The decoder never silently falls back to degraded algorithms. Instead, it reports explicit failure reasons:

Status	Meaning
`OK`	Frame decoded successfully
`LOW_CONFIDENCE`	Overall confidence below 15% threshold
`TOO_MANY_UNCERTAIN_BITS:N`	More than 2 bits with low confidence
`INVALID_MARKERS`	Prefix/suffix bits don't match any valid frame type
`INVALID_FRAME`	Frame couldn't be read at all

This makes debugging much easier - you know exactly why a frame failed to decode.

Audio Decoding Methods

The calibration system uses Zero-Crossing Rate (ZCR) analysis as the primary audio decoding method:

Why ZCR?

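The V2 encoding uses 400 Hz/800 Hz FSK with 16 bits per frame. At a 78125 Hz sample rate (Rene Wolf Sound Card) and 25 fps, a frame holds 3125 samples, so the 80% data portion leaves only ~156 samples per bit. That is about 2 ms, less than one full cycle of the 400 Hz tone, and an FFT over so few samples has bins roughly 500 Hz wide, which is too coarse to separate 400 Hz from 800 Hz reliably. ZCR works with partial cycles: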

# ZCR frequency estimation
crossings = count_zero_crossings(bit_audio)
estimated_freq = crossings / len(bit_audio) * sample_rate / 2
bit = '0' if estimated_freq < 600 else '1'  # 600Hz threshold

Frame structure handling:

First 15% skipped (pilot tone + silence)
Middle 80% decoded as FSK data
Last 5% skipped (trailing silence)
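
Putting the ZCR estimate and the frame structure together, a single frame could be decoded roughly as follows. This is a pure-Python sketch under stated assumptions: the frame is a list of float samples that starts exactly on a frame boundary, bits are read most significant first, and `count_zero_crossings` is implemented here as a simple sign-change count. The function names are illustrative.

```python
def count_zero_crossings(samples: list) -> int:
    """Count sign changes between consecutive samples."""
    negative = [s < 0 for s in samples]
    return sum(1 for a, b in zip(negative, negative[1:]) if a != b)


def decode_audio_frame(frame: list, sample_rate: int, bits: int = 16) -> int:
    """Decode one frame of FSK audio into a 16-bit word."""
    start = len(frame) * 15 // 100   # skip pilot tone + silence
    end = len(frame) * 95 // 100     # stop before trailing silence
    data = frame[start:end]
    per_bit = len(data) // bits

    word = 0
    for i in range(bits):
        bit_audio = data[i * per_bit:(i + 1) * per_bit]
        crossings = count_zero_crossings(bit_audio)
        estimated_freq = crossings / len(bit_audio) * sample_rate / 2
        bit = 0 if estimated_freq < 600 else 1   # 600 Hz threshold
        word = (word << 1) | bit                 # MSB first (assumption)
    return word
```

With only about 2 ms per bit, a clean 400 Hz tone typically yields one or two crossings and an 800 Hz tone three or four, so the estimate is coarse but stays on the correct side of the 600 Hz threshold.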

Sliding window search:

Because VHS wow/flutter shifts timing, FSK timecodes don't align to exact frame boundaries. The decoder uses 1/8 frame steps to find where valid timecodes actually occur:

Frame boundary search with 1/8 frame resolution:
Position 293.904: TC=44 VALID
Position 294.902: TC=45 VALID  (delta ~1.0 frame)
Position 295.901: TC=46 VALID  (delta ~1.0 frame)

Consecutive timecodes with ~1.0 frame spacing confirm successful decoding.
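
The search loop itself can be sketched independently of the decoder. In this illustration `decode_at` is a hypothetical callback that tries to decode a frame starting at a given sample index and returns a timecode or `None`; it is not a toolkit function.

```python
def sliding_search(n_samples: int, samples_per_frame: float, decode_at,
                   step_fraction: float = 0.125) -> list:
    """Try a decode every 1/8 frame and collect (position, timecode) hits.

    Positions are reported in frames, so consecutive valid timecodes
    should be about 1.0 apart.
    """
    step = samples_per_frame * step_fraction
    hits = []
    k = 0
    while k * step + samples_per_frame <= n_samples:
        start = int(round(k * step))
        timecode = decode_at(start)
        if timecode is not None:
            hits.append((start / samples_per_frame, timecode))
        k += 1
    return hits


# Stand-in decoder: frames really begin 240 samples (1/8 frame) late.
def fake_decode(start: int):
    if (start - 240) % 1920 == 0:
        return (start - 240) // 1920
    return None


print(sliding_search(7680, 1920, fake_decode))
# [(0.125, 0), (1.125, 1), (2.125, 2)]
```

A real decoder may report the same timecode at several neighbouring positions; this sketch does not deduplicate them.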

Perspective Correction

The corner markers (red at top-left/bottom-right, blue at top-right/bottom-left) enable geometric correction:

# Detect corners → compute perspective transform → correct frame
corrected_frame = cv2.warpPerspective(frame, transform_matrix, (width, height))

This corrects for rotation, scaling, and skew from VHS playback or capture.

Files and Locations

Generator files:

File	Purpose
`tools/timecode-generator/vhs_pattern_generator.py`	Generate calibration MP4 with cycles
`tools/timecode-generator/vhs_timecode_generator.py`	Single-frame generation
`tools/timecode-generator/shared_timecode_robust.py`	Core encoding/decoding
`docs/v2-timecode-implementation-plan.md`	Implementation plan and technical details

Calibration workflow files (in temp/):

File	Purpose
`calibration.lds`	RF capture from DomesdayDuplicator
`calibration.flac`	Raw audio capture (used for analysis)
`calibration.tbc`	TBC output from vhs-decode
`calibration.tbc.json`	TBC metadata
`calibration_ffv1.mkv`	Exported video (used for analysis)

Note: calibration_aligned.flac and calibration_final.mkv are NOT used for calibration analysis - they have TBC timing corrections applied that would corrupt the offset measurement.

Frequency Ranges and Guard Bands

       ┌─────────────────┐
       │   Bit '0'       │   Guard    ┌─────────────────┐
       │   400 Hz        │   Band     │   Bit '1'       │
       │  (300-500 Hz)   │ (500-650)  │   800 Hz        │
       └─────────────────┘            │  (650-950 Hz)   │
                                      └─────────────────┘
                                          Guard    ┌─────────────────┐
                                          Band     │   Pilot         │
                                        (950-1050) │   1200 Hz       │
                                                   │ (1050-1350 Hz)  │
                                                   └─────────────────┘

The frequencies were chosen to:

Stay well within VHS linear audio passband (100 Hz - 10 kHz)
Maintain 2:1 ratio for clear differentiation
Have sufficient guard bands to prevent overlap after VHS frequency shift
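
As a small illustration of the band plan, the following sketch maps a measured frequency to its band. The band edges are taken from the diagram above; the `BANDS` table, the function name, and the returned labels are hypothetical. The ZCR decoder described earlier uses a single 600 Hz threshold instead of this full lookup.

```python
# Band edges in Hz, copied from the diagram above.
BANDS = [
    ("BIT_0", 300.0, 500.0),
    ("BIT_1", 650.0, 950.0),
    ("PILOT", 1050.0, 1350.0),
]


def classify_frequency(freq_hz: float) -> str:
    """Return the band a frequency falls in, or a guard/out-of-range label."""
    for name, low, high in BANDS:
        if low <= freq_hz <= high:
            return name
    return "GUARD_OR_OUT_OF_RANGE"


print(classify_frequency(400.0))   # BIT_0
print(classify_frequency(420.0))   # BIT_0 (a 5% speed shift stays in band)
print(classify_frequency(600.0))   # GUARD_OR_OUT_OF_RANGE
print(classify_frequency(1200.0))  # PILOT
```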
