Decode Olympus DSS/DS2 dictation files anywhere: pure CLI, no Windows, no GUI, no commercial software. A production integration recipe for a format that was locked for ten years.
git clone https://github.com/Guillain-RDCDE/DS2-Anywhere
cd DS2-Anywhere
docker compose up --build
# Web UI: http://localhost:8080/convertisseur.php
# HTTP API: http://localhost:8765/health
Drop a .ds2 or .dss into examples/ (the folder is mounted into both containers), then go to the web UI and convert.
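The first `docker compose up --build` can take a while before the daemon answers, so a script that depends on the HTTP API may want to wait for the health endpoint. A minimal sketch, assuming only that `/health` answers with a 2xx status once the service is ready; the function name `wait_for_health` and the retry defaults are illustrative and not part of the repo:

```shell
#!/usr/bin/env bash
# Poll a health URL until it answers or the attempts run out.
# Assumption: any HTTP 2xx from the URL means the daemon is ready.
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (4xx/5xx).
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_health http://localhost:8765/health 30 1 && echo "API is up"
```

The function returns non-zero when the endpoint never comes up, so it can gate a later step with `&&`.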
sudo ./src/bin/install.sh
The installer asks the half-dozen questions that actually matter (install dir, audio root, DB creds if you want mode 2, alert email), then drops the config, the cron, the systemd unit and the CLI symlink in place. Run it again, nothing breaks.
In 2007, Olympus released the Digital Speech Standard 2 (.ds2) format for their professional dictation recorders. Proprietary codec, no public spec, no open decoder. Anyone who needed to process .ds2 files on Linux or macOS had exactly one path: run Olympus DSS Player or NCH Switch through a Windows VM.
In 2017, FFmpeg ticket #6091 was opened: "Add DS2 codec support". It sat unimplemented for nine years.
In February 2026, Kieran Hirpara published hirparak/dss-codec, the first open-source DS2 decoder, reverse-engineered from the Olympus DLLs using Ghidra and verified byte-for-byte against the official Olympus DirectShow filter.
Three months later, this repo shows how to take that work and put it in production: replacing a fragile Windows VM + commercial GUI software with a bash wrapper, a cron, and ~150 lines of glue.
- docs/: a long-form, didactic walkthrough of:
  - (1) how the codec was reverse-engineered (the genius part, not ours);
  - (2) how we integrated it into a production transcription pipeline that processes real-world dictations daily (the engineering part, ours);
  - (3) the empty-block bug: a decoder that was bit-exact on every file we tested and still wrong on paused recordings, the ten dead ends, and the twelve-line fix (a detective story worth reading even if you never touch DS2);
  - (4) cracking the re-sync block, the sequel: we ran the closed-source Olympus decoder inside a debugger we built from its own DLLs, hooked it at the instruction level, and read the format's last undocumented demux rule straight off the silicon, then deleted the Windows fallback for good;
  - (5) the bug that wasn't: the saga's twist, and the chapter we're proudest of. A residual "decoder bug" on paused recordings was cornered across a full research paper (analysis-by-synthesis proving the filter bit-exact, nine falsified hypotheses, a hidden state machine) and then overturned. We did what the paper said was impossible: ran the closed Olympus decoder under our own instrumentation (Linux + Wine + gdb), watched a reference lie to us in the exact shape of the symptom, and finally settled it the cheapest way there is: by listening. There was no bug; the "seven-second wound" was a person stepping away from the microphone. We kept every wrong turn in the record, framed. The most honest read in the repo, and the most useful if you reverse-engineer for a living.
- src/: the actual integration code: CLI, cron job, HTTP daemon, admin web UI. Sanitized of organization-specific bits; the patterns are reusable as-is.
- benchmarks/: performance comparison (WASM vs native, the chain we use vs the commercial Windows chain), and the validation campaign run on 35 real-world files.
BEFORE                                          AFTER
.ds2 ─► SSHFS ─► Windows VM ─► Switch.exe       .ds2 ─► cron (Linux)
                                   │                         │
                                   ▼                         ▼
                              .wav (mono)             .mp3 (mono 64k)
                                   │                         │
                                   ▼                         ▼
                              SSHFS back                Whisper API
                                   │                         │
                                   ▼                         ▼
                              glue script         ready for transcription
                                   │
                                   ▼
                              Whisper API
GUI app + Windows VM + SSHFS round-trip         bash + native binary, all local, ~10s/file
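The AFTER column boils down to a cron pass that converts whatever has not been converted yet. The sketch below illustrates that pattern and is not the repo's actual cron job. It assumes that `conv-dss-ds2-to-mp3` writes `NAME.mp3` next to `NAME.ds2` (as the sample output further down suggests); the `CONVERTER` variable, the function name and the paths in the comments are hypothetical:

```shell
#!/usr/bin/env bash
# One cron pass: convert every dictation that has no MP3 beside it yet.
# Assumption: the converter writes NAME.mp3 next to NAME.ds2 / NAME.dss.
CONVERTER="${CONVERTER:-conv-dss-ds2-to-mp3}"

convert_pending() {
  local dir="$1" src mp3 failures=0
  for src in "$dir"/*.ds2 "$dir"/*.dss; do
    [ -e "$src" ] || continue        # unmatched glob stays literal: skip it
    mp3="${src%.*}.mp3"
    [ -e "$mp3" ] && continue        # already converted on an earlier pass
    if ! "$CONVERTER" "$src"; then
      echo "conversion failed: $src" >&2
      failures=$((failures + 1))
    fi
  done
  # Succeed only if every pending file converted cleanly.
  [ "$failures" -eq 0 ]
}

# Hypothetical crontab entry (paths are placeholders); flock -n skips a
# pass if the previous one is still running:
#   */5 * * * * flock -n /tmp/ds2-pass.lock /usr/local/bin/ds2-pass.sh /srv/dictations
```

Skipping files that already have an MP3 makes the pass safe to repeat, and a failed conversion is simply retried on the next run.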
Convert one file:
conv-dss-ds2-to-mp3 recording.ds2
# [ds2_qp 16000Hz, 31.8min] recording.mp3 OK (14.55 Mo en 10.3 s)
Inspect a file without decoding:
conv-dss-ds2-to-mp3 --inspect recording.ds2
# format : ds2_qp
# chiffrement : none
# freq. nat. : 16000 Hz
# taille : 6754304 octets
Encrypted DS2 with password:
conv-dss-ds2-to-mp3 --password=mypwd recording.ds2
Full install + cron + web UI setup: docs/02-integration.md.
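The three commands above can be combined so that a script only passes a password when the file needs one. This is a sketch under assumptions: it supposes that `--inspect` prints plain `key : value` lines like the sample (with or without the leading `# `), and that `chiffrement : none` is how an unencrypted file is reported. The helper names `inspect_field` and `convert_auto` are made up for illustration:

```shell
#!/usr/bin/env bash
CONVERTER="${CONVERTER:-conv-dss-ds2-to-mp3}"

# Print the value of one "key : value" line from the --inspect output.
inspect_field() {
  local file="$1" key="$2"
  "$CONVERTER" --inspect "$file" |
    awk -F' *: *' -v key="$key" '
      { sub(/^#? */, "", $1) }       # tolerate a leading "# " prefix
      $1 == key { print $2; exit }'
}

# Convert a file, passing --password only when the file reports encryption.
convert_auto() {
  local file="$1" password="${2:-}" enc
  enc="$(inspect_field "$file" chiffrement)"
  if [ "$enc" = "none" ]; then
    "$CONVERTER" "$file"
  elif [ -n "$password" ]; then
    "$CONVERTER" --password="$password" "$file"
  else
    echo "password required (chiffrement: ${enc:-unknown}): $file" >&2
    return 2
  fi
}

# Usage:
#   convert_auto recording.ds2
#   convert_auto recording.ds2 mypwd
```

Anything other than `none` is treated as encrypted, including an empty or unparseable inspect result, so the sketch fails loudly rather than guessing.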
The decision to ship was based on an A/B against the reference Windows implementation on the same source file, not just a count of successful decodes. The full validation, in order of weight:
- A/B vs Switch.exe (same .ds2, both chains, both MP3s through the same Whisper API): transcripts are functionally identical. Switch.exe: 16.2 % low-confidence words. Our chain: 17.2 %. The gap is within Whisper's own run-to-run variance. The two chains are interchangeable for any downstream pipeline.
- Sample: 35 real production dictations (32× DS2 QP + 3× DSS SP, 6 h 48 min of audio total). 35 / 35 decoded successfully, zero failures. The sample is intentionally tight: DS2 files don't survive long in our pipeline (raw uploads are archived after ~2 weeks), and the A/B against the reference was what carried the call, not the headcount.
- Production: ~3,200 cron passes since go-live, zero errors logged. Every new DS2 entering the system now goes through this chain. The Switch VM stays on standby, untouched.
Full methodology and results →
If you have…
- 5 minutes: just read this README.
- 20 minutes: docs/01-reverse-engineering.md, the genius part.
- 30 minutes (the detective stories): docs/06 + docs/07, two production bugs hunted to ground, one of them by running the closed-source decoder inside a debugger we built from its own DLLs. Reads like fiction; every line happened.
- The twist: docs/09 (the research paper), then docs/10 (the reckoning). A rigorous case for a "last bug," then the chapter that overturns it: an instrumentable oracle built from the vendor's own DLLs, a reference that lied, and the cheapest decisive test in engineering. How careful work can be confidently wrong, and how to catch it.
- 60 minutes: all of docs/, in order. From "impossible for ten years" to "production in a weekend".
The intellectual heavy-lifting belongs entirely to:
- Kieran Hirpara, hirparak/dss-codec: the reverse-engineering that made all of this possible. MIT, February 2026.
- Gaspard Petit, dss-codec-wasm (WASM build) and dss-codec fork (Rust crate with CI, streaming, decryption; the one our Dockerfile uses). MIT.
- Patrick Domack, FFmpeg C port gist: independent C implementation of the spec, being prepared for upstream FFmpeg submission from this repo. MIT / public domain.
- lamejs (@breezystack/lamejs): pure-JS MP3 encoder. LGPL.
- FFmpeg: the encoder we use in the native chain. LGPL.
This repo is a recipe. The recipe needs the ingredients above to exist at all. Full credit breakdown: CREDITS.md.
MIT. Same as the upstream codec. Fork, adapt, deploy, integrate, but please keep proper attribution to the codec authors.
A decade of impossible, one bash command later.