Audio Synchronisation
Audio synchronisation is the core innovation of the DDD Capture Toolkit. This page explains how the sync system works and how to get the best results.
The Domesday Duplicator captures RF video from VHS tapes, but audio must be captured separately. This creates three major sync issues:
RF capture and audio capture are started by different processes, possibly on different hardware. Even milliseconds of difference creates noticeable sync issues.
RF and audio use independent clocks. Over a 2-hour tape, even tiny clock differences accumulate into seconds of drift.
Your VCR doesn't play at exactly 25fps (PAL) or 29.97fps (NTSC). It might be 25.0001fps, causing audio to slowly drift throughout playback.
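To put rough numbers on the last two issues, the arithmetic can be sketched as follows. The 50 ppm clock mismatch is an assumed, illustrative figure rather than a measurement of any particular hardware; the 25.0001 fps value is the one quoted above.

```python
def drift_seconds(duration_s: float, relative_error: float) -> float:
    """Accumulated drift after duration_s for a given fractional rate error."""
    return duration_s * relative_error


TAPE_SECONDS = 2 * 60 * 60  # a 2-hour tape

# Assumed example: two free-running clocks that disagree by 50 parts per million.
clock_drift = drift_seconds(TAPE_SECONDS, 50e-6)

# VCR playing at 25.0001 fps instead of the nominal 25 fps (PAL).
vcr_error = (25.0001 - 25.0) / 25.0
vcr_drift = drift_seconds(TAPE_SECONDS, vcr_error)

print(f"clock drift:     {clock_drift * 1000:.0f} ms")
print(f"VCR speed drift: {vcr_drift * 1000:.1f} ms")
```

Under these assumptions the clock mismatch alone adds up to several PAL frames (40 ms each) by the end of the tape, and larger mismatches scale linearly.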
Before the synchronised-capture system documented below existed, DdD users worked around the missing audio in three ways, all of which had real problems:
Run the DdD once for video RF, then again while recording audio separately.
- No two tape passes are identical — different tracking, different dropouts, different speed variations.
- You still have to manually align the audio.
- No reliable .tbc.json timing data that corresponds to the audio pass.
- Doubles capture time and tape wear.
Capture audio to a different recorder (e.g. a Mac via the player's audio outputs) while the DdD captures RF simultaneously.
- Still no clock synchronisation between devices, so drift accumulates over the tape.
- Still need to manually figure out the start offset.
- Different sample rates and timing references; manual extraction and alignment workflow.
Use one DdD for video RF, another configured for audio RF capture.
- Expensive — two DdDs.
- The original DomesdayDuplicator software was GUI-only with no way to start both captures programmatically at exactly the same moment.
- You still need to synchronise the clocks between the two units.
- Audio still needs alignment and drift compensation.
The underlying blocker all three of these ran into was that the DomesdayDuplicator software had no command-line interface. Without programmatic capture control, true automation wasn't possible, and "two captures at the same moment" was always going to have an uncontrolled offset. The first step of this toolkit's solution was to add command-line flags to the DomesdayDuplicator software itself — see Foundation: command-line control below.
The toolkit uses a three-part solution:
flowchart LR
A["Clockgen Lite<br/>(synchronises capture clocks)"] --> B["Offset Calculation<br/>(measures audio_delay)"]
B --> C["VhsDecodeAutoAudioAlign<br/>(corrects VCR speed wobble)"]
C --> D["Perfectly Synced Audio<br/>(in _final.mkv)"]
Before any of the three sync parts can do their job, both captures need to start on the same trigger. The original DomesdayDuplicator software was GUI-only — there was no way to start a capture programmatically, which made true synchronisation impossible. As part of this project, command-line flags were added to the DomesdayDuplicator software:
DomesdayDuplicator --start-capture --headless   # start capture, no GUI
DomesdayDuplicator --stop-capture               # stop capture programmatically
The capture script (ddd_clockgen_sync.py) uses these flags to start the RF capture and the audio capture together with precise timing. This is the foundation that makes everything else on this page possible.
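As a rough illustration of what programmatic control makes possible, the sketch below launches two commands back to back and measures the gap between the launches. It is not the code in ddd_clockgen_sync.py: `start_together` is a hypothetical helper, and the audio recorder's command line is not specified on this page, so a harmless stand-in command is used for both processes.

```python
import subprocess
import sys
import time

# Flags documented above for the patched DomesdayDuplicator.
RF_START_CMD = ["DomesdayDuplicator", "--start-capture", "--headless"]
RF_STOP_CMD = ["DomesdayDuplicator", "--stop-capture"]


def start_together(rf_cmd, audio_cmd):
    """Launch two commands back to back; return both processes and the launch gap."""
    rf_proc = subprocess.Popen(rf_cmd)
    rf_launched = time.monotonic()
    audio_proc = subprocess.Popen(audio_cmd)
    audio_launched = time.monotonic()
    return rf_proc, audio_proc, audio_launched - rf_launched


# Stand-in commands so the sketch runs without capture hardware.
# On a real rig the first would be RF_START_CMD and the second your
# audio recorder's command line (an assumption, not defined here).
placeholder = [sys.executable, "-c", "pass"]
rf_proc, audio_proc, gap_s = start_together(placeholder, placeholder)
rf_proc.wait()
audio_proc.wait()
print(f"launch gap: {gap_s * 1000:.1f} ms")
```

Even with both launches scripted, the gap is small but not zero, which is why a measured offset is still needed.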
Clockgen Lite is a hardware modification that clocks both the Domesday Duplicator and your audio ADC from the same master clock source.
Benefits:
- Sample-level synchronisation
- No clock drift between RF and audio
- Wow and flutter captured identically in both streams
Even with synchronised clocks, the two capture processes start at slightly different times. The toolkit measures this offset using the VHS Timecode Calibration system:
- Generate a calibration video with embedded timecodes (visual + audio FSK)
- Record to VHS and capture back
- Analyse to compare where the same timecode appears in video vs audio
- Save the measured offset to configuration
The offset is typically 500-700ms and remains consistent for your hardware setup. Once calibrated, the offset is automatically applied during capture.
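The comparison step can be sketched as below, assuming the analysis has already produced, for each decoded timecode, the time at which it was seen in the video and the time at which it was heard in the audio. The function name, the use of a median, and the sign convention (positive means the audio arrives later) are illustrative assumptions, not the calibration tool's actual implementation.

```python
from statistics import median


def measure_offset_ms(video_times_s, audio_times_s):
    """Median of (audio - video) arrival times for matching timecodes, in ms."""
    if not video_times_s or len(video_times_s) != len(audio_times_s):
        raise ValueError("need equal-length, non-empty lists of matched timecodes")
    diffs_ms = [(a - v) * 1000.0 for v, a in zip(video_times_s, audio_times_s)]
    return median(diffs_ms)


# Hypothetical detections: the same four timecodes located in each stream.
video_seen = [1.000, 2.000, 3.000, 4.000]
audio_heard = [1.601, 2.599, 3.600, 4.650]  # the last pair is a deliberate outlier

offset_ms = measure_offset_ms(video_seen, audio_heard)
print(f"measured offset: {offset_ms:.1f} ms")
```

A median is used here so that a single misread timecode does not skew the result.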
VhsDecodeAutoAudioAlign by Rene Wolf handles VCR speed compensation:
How it works:
- Reads actual field timing from .tbc.json (produced by vhs-decode)
- Calculates exactly where each video field occurs in time
- Time-stretches/compresses audio to match actual video field boundaries
- Handles dropped fields (tape damage) by skipping corresponding audio
The maths:
Output sample 10000 → maps to input sample 9998.7 (via timing projection)
Nearest-neighbour interpolation is used. Speed differences are tiny, so sample skipping/duplication is rare and inaudible.
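The idea of a timing projection followed by nearest-neighbour selection can be sketched as follows. This is a simplified illustration, not VhsDecodeAutoAudioAlign's code: it assumes a list of measured field start times in seconds, a nominal PAL field rate of 50 Hz, and linear interpolation between neighbouring fields. All names are hypothetical.

```python
def project_sample(n_out, field_times_s, field_rate_hz=50.0, sample_rate_hz=78125):
    """Map an output sample index to a fractional input sample position."""
    pos = n_out / sample_rate_hz * field_rate_hz      # position measured in fields
    k = min(int(pos), len(field_times_s) - 2)         # field at or before pos
    frac = pos - k                                    # how far into that field
    t_in = field_times_s[k] + frac * (field_times_s[k + 1] - field_times_s[k])
    return t_in * sample_rate_hz


def nearest_input_sample(n_out, field_times_s, **kwargs):
    """Nearest-neighbour choice: round the projected position to an integer."""
    return round(project_sample(n_out, field_times_s, **kwargs))


# Hypothetical tape running 0.5% slow: each field lasts 20.1 ms instead of 20 ms.
slow_fields = [0.0, 0.0201, 0.0402]
print(nearest_input_sample(1000, slow_fields))
```

With nominal field times the mapping is the identity; when the tape runs slow, each output sample pulls from a slightly later input sample, which is the time-compression described above.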
For audio alignment to work, you need:
- Audio file - .flac captured alongside RF
- TBC JSON - .tbc.json produced by the vhs-decode decode step
In the Workflow Control Centre, after decode is complete:
1A # Start audio alignment for project 1
- The alignment tool reads the .tbc.json for timing data
- It reads your captured .flac audio
- It stretches/compresses audio to match video timing
- Outputs _aligned.flac file
Why FLAC, not WAV? The aligned output was originally .wav, which is fine for short tapes but caps at ~4 GB. At 24-bit / 78125 Hz (the Clockgen Lite native rate) that's reached on tapes longer than ~2.5 hours, and the alignment would silently truncate. FLAC has no such limit, is read natively by final-mux and by Resolve (and similar NLEs) at the resampled output rate, and is lossless so nothing is given up. The toolkit still detects existing _aligned.wav files from older runs for backward compatibility, but every new alignment produces _aligned.flac.
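The ~2.5 hour figure can be checked with a few lines of arithmetic. The sketch assumes uncompressed stereo PCM and the 32-bit size fields of the classic WAV format; the helper name is hypothetical.

```python
WAV_LIMIT_BYTES = 2**32  # classic RIFF/WAV stores sizes in 32-bit fields (~4 GB)


def max_wav_hours(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Longest uncompressed PCM recording that fits under the WAV size limit."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return WAV_LIMIT_BYTES / bytes_per_second / 3600


# 24-bit stereo (assumed) at the 78125 Hz capture rate.
hours = max_wav_hours(78125, 24, 2)
print(f"WAV limit reached after about {hours:.2f} hours")
```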
After both export and alignment are complete:
1F # Create final muxed output
This combines:
- ProjectName_ffv1.mkv - Video from export
- ProjectName_aligned.flac - Synchronised audio
Into:
- ProjectName_final.mkv - Final output with synced audio
If you don't have the Clockgen Lite modification, audio sync is still possible but requires more manual work:
- Capture audio via a separate recorder
- Manual offset - Determine start offset by comparing audio/video
- VhsDecodeAutoAudioAlign - Still handles speed compensation
- Manual verification - May need adjustment
Results won't be as perfect as with Clockgen Lite, but are still much better than no alignment.
If audio is early or late by a constant amount throughout the capture, the initial offset calibration may be wrong. Run the VHS Timecode Calibration process to measure the correct offset for your hardware setup. The offset is stored in audio_delay in your configuration.
If sync starts out correct but drifts as the tape plays, this indicates clock drift between RF and audio capture. Clockgen Lite eliminates this. Without it, VhsDecodeAutoAudioAlign does its best but results may vary.
Sudden sync jumps or audio glitches usually indicate dropped fields from tape damage. The alignment tool handles this automatically, but severe damage may cause audible artifacts.
If alignment fails with "no timing data", ensure vhs-decode completed successfully and produced a valid .tbc.json file.
CAPTURE
├── ProjectName.lds # RF capture
├── ProjectName.flac # Audio capture
└── ProjectName.json # Capture metadata
DECODE
├── ProjectName.tbc # Time base corrected video
└── ProjectName.tbc.json # TBC metadata with field timing ← Required for alignment
ALIGN
└── ProjectName_aligned.flac # Synchronised audio output
This file contains per-field timing information that makes precision alignment possible:
{
"videoParameters": {
"system": "PAL",
"fieldWidth": 1135,
"fieldHeight": 313
},
"fields": [
{"diskLoc": 0, "fieldNo": 0, "fileLoc": 0, "isFirstField": true},
{"diskLoc": 40000, "fieldNo": 1, "fileLoc": 40000, "isFirstField": false},
...
]
}
The fields array maps each video field to its exact position in the RF capture, enabling sample-accurate audio alignment.
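A minimal sketch of reading that structure is shown below. It assumes that fileLoc is an RF sample index and that the RF capture runs at 40 MHz; both are assumptions for illustration, as is the helper name. The inline sample simply reuses the two entries from the excerpt above, so the resulting times are illustrative only.

```python
import json

RF_SAMPLE_RATE_HZ = 40_000_000  # assumed RF capture rate (40 MHz)

# Inline stand-in for a real ProjectName.tbc.json; in practice you would
# json.load() the file produced by the decode step.
SAMPLE_TBC_JSON = """
{
  "videoParameters": {"system": "PAL", "fieldWidth": 1135, "fieldHeight": 313},
  "fields": [
    {"diskLoc": 0, "fieldNo": 0, "fileLoc": 0, "isFirstField": true},
    {"diskLoc": 40000, "fieldNo": 1, "fileLoc": 40000, "isFirstField": false}
  ]
}
"""


def field_times_s(tbc: dict) -> list:
    """Convert each field's RF sample position into seconds from capture start."""
    return [field["fileLoc"] / RF_SAMPLE_RATE_HZ for field in tbc["fields"]]


tbc = json.loads(SAMPLE_TBC_JSON)
times = field_times_s(tbc)
print(tbc["videoParameters"]["system"], times)
```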
The audio is captured at 78125 Hz, the native rate of the Clockgen Lite ADC. This is non-standard and most NLEs (DaVinci Resolve, Premiere, etc.) either reject it outright or resample it themselves at lower quality. The toolkit handles this by resampling once at the final-mux step, using aresample=resampler=soxr:precision=33:osf=s32 for archival-quality conversion on the non-integer ratio.
flowchart LR
flac[(".flac<br/>78125 Hz<br/>master")] --> align
tbcjson[(".tbc.json<br/>field timing")] --> align
align["align<br/>operates at 78125 Hz<br/>(unchanged)"] --> aligned[("_aligned.flac<br/>78125 Hz")]
aligned --> mux
ffv1[("_ffv1.mkv<br/>video only")] --> mux
mux["final-mux<br/>aresample=soxr<br/>(78125 → 96000)"] --> final[("_final.mkv<br/>96 kHz FLAC<br/>(configurable)")]
Resampling happens only at final-mux. Every earlier stage uses 78125 Hz natively. The .flac master is never modified.
- The align tool maps audio samples to RF sample positions in .tbc.json (which are in 40 MHz units). Its math is built around the 78125 Hz source rate.
- Resampling before align would either break the math (wrong samples-per-field count) or require two resamples in series (extra quality loss).
- Keeping align at native rate and resampling once at final-mux preserves the cleanest possible sync calculation and the highest possible audio fidelity.
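For readers who want to see what a single resample-at-mux step might look like on the command line, the sketch below builds an ffmpeg argument list. It is not the toolkit's actual invocation: the function name and argument layout are assumptions, the soxr options are the ones quoted above, and the command requires an ffmpeg build with libsoxr.

```python
def build_final_mux_cmd(video_mkv, aligned_flac, output_mkv, rate="96000"):
    """Build an ffmpeg command that copies video and resamples audio once."""
    cmd = [
        "ffmpeg",
        "-i", video_mkv,        # input 0: video-only FFV1 export
        "-i", aligned_flac,     # input 1: aligned audio at 78125 Hz
        "-map", "0:v",
        "-map", "1:a",
        "-c:v", "copy",         # video is passed through untouched
    ]
    if rate != "none":
        # aresample takes the target rate first, then resampler options.
        cmd += ["-af", f"aresample={rate}:resampler=soxr:precision=33:osf=s32"]
    cmd += ["-c:a", "flac", output_mkv]
    return cmd


cmd = build_final_mux_cmd(
    "ProjectName_ffv1.mkv", "ProjectName_aligned.flac", "ProjectName_final.mkv"
)
print(" ".join(cmd))
```

Passing "none" as the rate omits the filter, so the audio keeps its native 78125 Hz.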
Two defaults live in config.json under performance_settings:
| Setting | Default | Options |
|---|---|---|
| default_audio_resample_rate | "96000" | "none", "48000", "96000", "192000" |
| default_audio_format | "flac" | "flac", "wav" |
Default rationale:
- 96000 Hz is the closest common standard above 78125 (next-most-common would be 48000 Hz, which is below the source rate). Resampling up preserves all source bandwidth.
- FLAC is lossless, compressed, and has no file-size limit. The classic WAV format is capped at 4 GB — too small for an 8-hour LP capture at 24/96 stereo. DaVinci Resolve 16+ (released 2019) supports FLAC natively.
Set "none" if you actively want the _final.mkv to keep the native 78125 Hz audio.
Defaults can be overridden per-project via the audio flags in Project Flags:
| Flag | Value | Effect |
|---|---|---|
| resample_target | "none" / "48000" / "96000" / "192000" | Per-project resample rate |
| audio_format | "flac" / "wav" | Per-project codec |
Legacy boolean flags resample_48k (= "48000") and output_wav (= "wav") are still respected for backward compatibility.
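One way the defaults, per-project flags, and legacy flags could combine is sketched below. The key names come from the tables above; the function itself and the precedence order (new flags over legacy flags over config defaults) are assumptions for illustration and may not match the toolkit's behaviour.

```python
def resolve_audio_settings(config: dict, project_flags: dict):
    """Return (resample_rate, audio_format) for one project."""
    perf = config.get("performance_settings", {})
    rate = perf.get("default_audio_resample_rate", "96000")
    fmt = perf.get("default_audio_format", "flac")

    # Legacy boolean flags (assumed to override the config defaults).
    if project_flags.get("resample_48k"):
        rate = "48000"
    if project_flags.get("output_wav"):
        fmt = "wav"

    # Newer per-project flags (assumed to take precedence over legacy ones).
    rate = project_flags.get("resample_target", rate)
    fmt = project_flags.get("audio_format", fmt)
    return rate, fmt


config = {"performance_settings": {"default_audio_resample_rate": "96000",
                                   "default_audio_format": "flac"}}
print(resolve_audio_settings(config, {}))
print(resolve_audio_settings(config, {"resample_48k": True}))
print(resolve_audio_settings(config, {"resample_target": "none"}))
```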
If you have an old capture whose _final.mkv still has 78125 Hz audio and you want to upgrade it without redoing the slow decode/export:
- Confirm default_audio_resample_rate in config.json is set how you want (e.g. "96000")
- Delete the existing _aligned.flac and _final.mkv
- Re-run Align for the project (1A in the Workflow Control Centre)
- Re-run Final mux for the project (1F)
The original .flac, .lds, .tbc, and _ffv1.mkv are reused — only the cheap end of the pipeline runs. The new _final.mkv will have audio at the configured rate.
- Use Clockgen Lite - The best results require synchronised capture clocks
- Capture together - Always capture RF and audio from the same playback
- Verify TBC JSON - Ensure decode completed successfully before alignment
- Check the result - Watch a segment of the final output to verify sync
- Keep originals - Keep your original .flac in case you need to re-align
Workflow Commands:
- 1D - Decode project 1
- 1M - Compress project 1
- 1E - Export project 1
- 1A - Align audio
- 1F - Final mux
- 1X - Project settings
- 1mv - Validate compressed master (Tier 3)
- hash 1 - Hash files lacking a recorded hash
- check 1 - Re-hash and compare to log
Key Features:
- PAL/NTSC auto-detect
- Reverse field order (automatic)
- Segment testing mode
- Three-tier compress validation
- Automatic checksums + per-project validation log