A pure-Go module for speech-to-text transcription, with one unified
Transcriber interface and three implementations:
- Wyoming: TCP wire protocol used by the Home Assistant voice
  ecosystem (e.g. `wyoming-faster-whisper`). JSON-header + binary-payload
  framing, implemented from the wire format directly.
- OpenAI: HTTPS multipart POST to `/v1/audio/transcriptions`, with
  Bearer-token auth and configurable endpoint.
- whisper.cpp: the same multipart protocol shape pointed at a local
  `whisper-server` over loopback HTTP, no auth.

```go
import (
	"github.com/matthewjhunter/asrclient"
	"github.com/matthewjhunter/asrclient/wyoming"
)

c := wyoming.NewClient("localhost:10300")
defer c.Close()

tr, err := c.Transcribe(ctx, pcm, asrclient.Options{Language: "en"})
// tr.Text, tr.Language, tr.DecodeDuration, tr.Segments
```

- Pure Go, no CGo. `CGO_ENABLED=0` builds clean; the race detector
  (which requires CGo) is opt-in via `task test:race`.
- Protocol clients only. Spawning `whisper-server`, port discovery,
  restart-on-crash, `/health` gating: none of that lives here. The
  whisper.cpp client expects a server already running and reachable.
  Keeping protocol and lifecycle separate is the reason the module
  exists as its own thing.
- Narrow surface. `Transcriber`, `Options`, `Transcript`, `Segment` are
  the public types; backend constructors are `NewClient(...)` with
  optional `WithX(...)` options. No speculative fields; they're added
  when a real consumer needs them.

The v0.x series is not API-stable. asrclient and its primary
consumer (dicta) are both early-stage; expect breaking renames or
shape changes between v0.x minor versions as both projects shake
out. Pin a specific version in your go.mod. The interface will
settle and a v1.0 will follow once a few consumers have stress-tested
it.
Specifically deferred for a later v0.x:
- Streaming / interim transcripts. The current `Transcribe` call
  buffers a full utterance and returns one final `Transcript`. Wyoming
  supports incremental audio in and interim transcripts out; OpenAI's
  incremental story is the separate Realtime API (WebSocket);
  `whisper-server` doesn't stream. When a consumer needs live captioning
  or partial results, an opt-in `StreamingTranscriber` interface will
  likely land. Only the Wyoming backend would implement it, and callers
  would feature-detect via type assertion.

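
Feature detection via type assertion could look roughly like the sketch below. None of this exists in the module yet: the `StreamingTranscriber` method set, the simplified `Transcribe` signature, and both stub backends are hypothetical and only illustrate the pattern.

```go
package main

import (
	"context"
	"fmt"
)

// Transcriber is a simplified stand-in for the module's interface;
// the real signature takes Options and returns a Transcript.
type Transcriber interface {
	Transcribe(ctx context.Context, pcm []byte) (string, error)
}

// StreamingTranscriber is hypothetical. The method name and channel
// types are assumptions made for this sketch only.
type StreamingTranscriber interface {
	Transcriber
	TranscribeStream(ctx context.Context, frames <-chan []byte) (<-chan string, error)
}

// batchBackend implements only the buffered call.
type batchBackend struct{}

func (batchBackend) Transcribe(ctx context.Context, pcm []byte) (string, error) {
	return "final transcript", nil
}

// streamingBackend adds the optional streaming method on top.
type streamingBackend struct{ batchBackend }

func (streamingBackend) TranscribeStream(ctx context.Context, frames <-chan []byte) (<-chan string, error) {
	out := make(chan string)
	close(out) // stub: emits no interim results
	return out, nil
}

// supportsStreaming reports whether tr also implements the optional
// streaming interface, without the caller knowing the concrete type.
func supportsStreaming(tr Transcriber) bool {
	_, ok := tr.(StreamingTranscriber)
	return ok
}

func main() {
	for _, tr := range []Transcriber{batchBackend{}, streamingBackend{}} {
		fmt.Printf("%T streaming=%v\n", tr, supportsStreaming(tr))
	}
}
```

Callers that only need final transcripts keep depending on `Transcriber` and are unaffected by the optional interface.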
All callers and backends assume the same PCM shape:
| Field | Value |
|---|---|
| Sample rate | 16 kHz |
| Channels | mono |
| Sample format | int16 little-endian |
| Frame size | 80 ms / 1280 samples / 2560 bytes |
Constants live in audio.go (SampleRateHz, Channels, SampleWidth,
FrameMS, FrameSamples, FrameBytes). This matches the conventions
of the Wyoming voice-services ecosystem and openWakeWord, so consumers
that already produce frames in this format pay no resampling cost.
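
The frame arithmetic in the table can be checked with a short sketch. The constant names follow the list above, but the values are re-declared locally from the table rather than imported, and `encodePCM` and `splitFrames` are hypothetical helpers, not part of the module.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values taken from the table above; re-declared here for illustration.
const (
	SampleRateHz = 16000
	Channels     = 1
	SampleWidth  = 2 // bytes per int16 sample
	FrameMS      = 80
	FrameSamples = SampleRateHz * FrameMS / 1000             // 1280
	FrameBytes   = FrameSamples * Channels * SampleWidth     // 2560
)

// encodePCM packs int16 samples as little-endian bytes.
func encodePCM(samples []int16) []byte {
	buf := make([]byte, len(samples)*SampleWidth)
	for i, s := range samples {
		binary.LittleEndian.PutUint16(buf[i*SampleWidth:], uint16(s))
	}
	return buf
}

// splitFrames cuts pcm into whole FrameBytes-sized frames and returns
// any trailing partial frame separately.
func splitFrames(pcm []byte) (frames [][]byte, rest []byte) {
	for len(pcm) >= FrameBytes {
		frames = append(frames, pcm[:FrameBytes])
		pcm = pcm[FrameBytes:]
	}
	return frames, pcm
}

func main() {
	pcm := encodePCM(make([]int16, SampleRateHz)) // one second of silence
	frames, rest := splitFrames(pcm)
	fmt.Println(FrameSamples, FrameBytes, len(frames), len(rest))
}
```

One second of audio is 32000 bytes, which is 12 full frames plus a 1280-byte remainder, so a producer that buffers by time rather than by frame needs to carry the tail over to the next call.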

```sh
go get github.com/matthewjhunter/asrclient
```

```go
c := wyoming.NewClient("localhost:10300",
	wyoming.WithDialTimeout(5*time.Second))
defer c.Close()
```

One TCP connection per Client, opened lazily on the first call. On a
transport error the connection is dropped and the next call redials.
No background reconnect goroutine.
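
The lazy-dial, drop-on-error behaviour described above can be sketched with the standard library alone. This is an illustration of the pattern, not the Wyoming client's actual code: `lazyConn`, its `do` method, and the `demo` helper are all hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"sync"
	"time"
)

// lazyConn holds at most one TCP connection and opens it on demand.
type lazyConn struct {
	addr    string
	timeout time.Duration

	mu    sync.Mutex
	conn  net.Conn
	dials int // counts dials, for demonstration only
}

// do runs fn on the shared connection, dialing first if none is open.
// If fn fails, the connection is closed and dropped so that the next
// call redials. There is no background reconnect goroutine.
func (l *lazyConn) do(fn func(net.Conn) error) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.conn == nil {
		c, err := net.DialTimeout("tcp", l.addr, l.timeout)
		if err != nil {
			return err
		}
		l.conn = c
		l.dials++
	}
	if err := fn(l.conn); err != nil {
		l.conn.Close()
		l.conn = nil
		return err
	}
	return nil
}

// Close releases the connection if one is open.
func (l *lazyConn) Close() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.conn == nil {
		return nil
	}
	err := l.conn.Close()
	l.conn = nil
	return err
}

// demo runs four calls against a throwaway loopback listener and
// returns how many dials they needed.
func demo() (int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return // listener closed
			}
			go func() { defer c.Close(); io.Copy(io.Discard, c) }()
		}
	}()

	l := &lazyConn{addr: ln.Addr().String(), timeout: time.Second}
	defer l.Close()
	ping := func(c net.Conn) error { _, err := c.Write([]byte("ping")); return err }

	if err := l.do(ping); err != nil { // first call dials
		return l.dials, err
	}
	if err := l.do(ping); err != nil { // second call reuses the connection
		return l.dials, err
	}
	// A simulated transport error drops the connection.
	_ = l.do(func(net.Conn) error { return errors.New("simulated transport error") })
	if err := l.do(ping); err != nil { // next call redials
		return l.dials, err
	}
	return l.dials, nil
}

func main() {
	n, err := demo()
	fmt.Println("dials:", n, "err:", err)
}
```

Holding the mutex for the whole call serializes requests on the single connection, which is the simplest way to keep one request's bytes from interleaving with another's.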

```go
c := openai.NewClient(os.Getenv("OPENAI_API_KEY"),
	openai.WithModel("whisper-1"),
	openai.WithTimeout(30*time.Second))
defer c.Close()
```

API key is sent as Bearer auth; pass `""` to omit the header for
private deployments that accept anonymous traffic. TLS verification is
on by default; `WithTLSInsecureSkipVerify()` is available for local-LAN
testing only.
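
For readers unfamiliar with the request shape, a multipart POST with an optional Bearer header might be built as in the sketch below. It is not the code in `internal/httpcore`: `newTranscriptionRequest` is a hypothetical helper, the audio bytes are treated as opaque, and the test server stands in for a real endpoint. The `file` and `model` field names follow the OpenAI transcription API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
)

// newTranscriptionRequest builds a multipart/form-data POST carrying a
// model name and an audio file. An empty apiKey omits the
// Authorization header entirely.
func newTranscriptionRequest(endpoint, apiKey, model string, audio []byte) (*http.Request, error) {
	var body bytes.Buffer
	mw := multipart.NewWriter(&body)
	if err := mw.WriteField("model", model); err != nil {
		return nil, err
	}
	fw, err := mw.CreateFormFile("file", "audio.wav")
	if err != nil {
		return nil, err
	}
	if _, err := fw.Write(audio); err != nil {
		return nil, err
	}
	// Close writes the trailing boundary; the body is invalid without it.
	if err := mw.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, endpoint, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Local stand-in server that echoes what it received.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "auth=%q model=%q", r.Header.Get("Authorization"), r.FormValue("model"))
	}))
	defer srv.Close()

	for _, key := range []string{"sk-example", ""} {
		req, err := newTranscriptionRequest(srv.URL, key, "whisper-1", []byte("fake audio"))
		if err != nil {
			panic(err)
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		out, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		fmt.Println(string(out))
	}
}
```

The same request shape, with an empty key and a loopback endpoint, is what a `whisper-server` deployment would receive.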

```go
c := whispercpp.NewClient(
	whispercpp.WithEndpoint("http://127.0.0.1:8081/v1/audio/transcriptions"))
defer c.Close()
```

No auth header is sent. The model field is required by the protocol but
ignored by `whisper-server`; the default "whisper-1" is the
conventional placeholder.

```
asrclient/
├── client.go       # Transcriber, Options, Transcript, Segment
├── audio.go        # frame-format constants
├── wyoming/        # Wyoming wire protocol + Transcriber impl
├── openai/         # OpenAI HTTPS Transcriber
├── whispercpp/     # OpenAI-protocol Transcriber, loopback defaults
└── internal/
    └── httpcore/   # shared multipart/form-data POST core
```

wyoming/ keeps zero non-stdlib imports beyond the parent package's
types — it can be lifted into its own module if a consumer ever needs
just the wire protocol without the rest of asrclient.

```sh
task            # list tasks
task test       # go test ./...
task test:race  # go test -race ./... (requires CGo)
task check      # vet + fmt + lint + test + vuln
```

See CONTRIBUTING.md for the contribution guide and SECURITY.md for vulnerability reporting.

Apache-2.0 — see LICENSE.