Add OGG support and dynamically query supported audio formats #237

Merged
altic-dev merged 3 commits into altic-dev:main from daaain:support-more-formats
Apr 2, 2026

@daaain (Contributor) commented Mar 30, 2026

Description

Replace hardcoded format lists with a dynamic query to AVURLAsset.audiovisualTypes(), which returns every audio/video format the OS can actually decode. This means new codecs Apple adds in future macOS releases are picked up automatically with zero code changes.

Previously the supported formats were hardcoded in 4 separate places (9 extensions). They are now centralised in MeetingTranscriptionService and derived at launch from AVFoundation.

Changes:

  • MeetingTranscriptionService: query AVURLAsset.audiovisualTypes() and convert to file extensions via UTType; centralise allowedContentTypes, user-facing description, and drop-error copy
  • MeetingTranscriptionView: replace all 4 hardcoded format definitions with references to the service's centralised constants
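
The centralisation described above could look roughly like the following sketch. The type name `SupportedFormats`, its members, and the wording of the strings are hypothetical and not taken from the PR; the extension set is passed in rather than queried, so the snippet runs without AVFoundation.

```swift
import Foundation

/// Hypothetical single source of truth for supported formats.
/// In the PR the set is derived from AVFoundation at launch; here it is injected.
struct SupportedFormats {
    let extensions: Set<String>

    init(extensions: [String]) {
        // Normalise once so every caller compares lowercase extensions.
        self.extensions = Set(extensions.map { $0.lowercased() })
    }

    /// User-facing description, e.g. for a file picker caption.
    var description: String {
        extensions.sorted().map { $0.uppercased() }.joined(separator: ", ")
    }

    /// Error copy shown when an unsupported file is dropped (assumed wording).
    var dropErrorMessage: String {
        "Unsupported file. Supported formats: \(description)"
    }

    /// Extension check shared by every call site.
    func supports(_ url: URL) -> Bool {
        extensions.contains(url.pathExtension.lowercased())
    }
}

let formats = SupportedFormats(extensions: ["wav", "MP3", "ogg"])
print(formats.description)
print(formats.supports(URL(fileURLWithPath: "/tmp/meeting.OGG")))
```

Because the description and the error copy are both computed from the same set, the view code has nothing left to hardcode, which is the point of the refactoring.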

Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update

Related Issues

Testing

  • Tested on Intel Mac
  • Tested on Apple Silicon Mac
  • Tested on macOS 26.3.1
  • Ran linter locally: swiftlint --strict --config .swiftlint.yml Sources
  • Ran formatter locally: swiftformat --config .swiftformat Sources - I tried, but it created changes in 65 files
  • Built locally: sh build_incremental.sh - I tried, got build_incremental.sh: line 18: /Users/dain/workspace/FluidVoice/build_dev.sh: No such file or directory

Notes

On macOS 26.3.1 AVURLAsset.audiovisualTypes() resolves to 52 formats:

3g2, 3gp, aa, aac, aax, ac3, aifc, aiff, aivu, amr, au, avi, awb,
caf, dv, eac3, flac, itt, loas, m1a, m2a, m2p, m2v, m3u, m4a, m4b,
m4p, m4r, m4v, mod, mov, mp1, mp2, mp3, mp4, mpa, mpg, mqv, mts,
ogg, pls, qta, sc2, scc, ts, ttml, vob, vtt, w64, wav, webvtt, xhe


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b463ee96f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +75 to +77
let avTypes = AVURLAsset.audiovisualTypes()
let extensions = avTypes.compactMap { UTType($0.rawValue)?.preferredFilenameExtension }
return Set(extensions)

P2: Filter dynamic types to audio formats before validation

The new allowlist is built from AVURLAsset.audiovisualTypes() without filtering, but this API includes non-transcribable types (for example subtitle/playlist formats like vtt, ttml, m3u shown in the commit notes). Because this set gates fileNotSupported, those files now pass extension validation and only fail later at AVAudioFile(forReading:) with an audioConversionFailed error, which is a user-visible regression from the previous explicit unsupported-format path.

Useful? React with 👍 / 👎.
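
The distinction the review draws, rejecting a file early versus failing late during decoding, can be illustrated with a small sketch. The error case names `fileNotSupported` and `audioConversionFailed` come from the comment above; the enum itself, the `validate` function, and the sample allowlists are hypothetical.

```swift
import Foundation

// Error cases named in the review; the enum is a hypothetical stand-in.
enum TranscriptionError: Error, Equatable {
    case fileNotSupported(String)
    case audioConversionFailed
}

/// Rejects a file by extension before any decoding is attempted.
func validate(_ url: URL, allowed: Set<String>) throws {
    let ext = url.pathExtension.lowercased()
    guard allowed.contains(ext) else {
        throw TranscriptionError.fileNotSupported(ext)
    }
}

// Unfiltered list: a subtitle file passes validation and could only fail later, at decode time.
let unfiltered: Set<String> = ["wav", "ogg", "vtt", "m3u"]
// Filtered list: the same file is rejected up front.
let filtered: Set<String> = ["wav", "ogg"]
let subtitle = URL(fileURLWithPath: "/tmp/captions.vtt")

func outcome(_ allowed: Set<String>) -> String {
    do {
        try validate(subtitle, allowed: allowed)
        return "accepted"
    } catch {
        return "rejected: \(error)"
    }
}

print(outcome(unfiltered))
print(outcome(filtered))
```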

@daaain (Contributor, Author) replied:

Is there any clever way to do this filtering or would I need to hardcode the list of non-transcribable types?

@daaain (Contributor, Author) replied:

Found a way to filter dynamically, commit

cat <<'EOF' > /tmp/probe3.swift
import AVFoundation
import UniformTypeIdentifiers

let avTypes = AVURLAsset.audiovisualTypes()
let audioOrVideo = avTypes.compactMap { fileType -> String? in
    guard let utType = UTType(fileType.rawValue) else { return nil }
    guard utType.conforms(to: .audio) || utType.conforms(to: .movie) else { return nil }
    return utType.preferredFilenameExtension
}
let sorted = Set(audioOrVideo).sorted()
print("Audio/Video only (\(sorted.count)): \(sorted)")

// Show what got filtered out
let all = avTypes.compactMap { UTType($0.rawValue)?.preferredFilenameExtension }
let excluded = Set(all).subtracting(Set(audioOrVideo)).sorted()
print("\nExcluded (\(excluded.count)): \(excluded)")
EOF
swiftc -sdk $(xcrun --sdk macosx --show-sdk-path) /tmp/probe3.swift -o /tmp/probe3 && /tmp/probe3
Audio/Video only (36): ["3g2", "3gp", "aa", "aac", "aax", "ac3", "aifc", "aiff", "amr", "au", "avi", "caf", "dv", "eac3", "flac", "loas", "m2p", "m2v", "m4a", "m4b", "m4p", "m4r", "m4v", "mod", "mov", "mp2", "mp3", "mp4", "mpg", "mts", "ogg", "qta", "ts", "vob", "w64", "wav"]

Excluded (16): ["aivu", "awb", "itt", "m1a", "m2a", "m3u", "mp1", "mpa", "mqv", "pls", "sc2", "scc", "ttml", "vtt", "webvtt", "xhe"]

@daaain (Contributor, Author) replied:

These are the descriptions for the remaining 36; they are all containers that can genuinely carry audio:

cat <<'EOF' > /tmp/probe4.swift
import AVFoundation
import UniformTypeIdentifiers

let avTypes = AVURLAsset.audiovisualTypes()
let audioOrVideo: [(String, String, Bool, Bool)] = avTypes.compactMap { fileType in
    guard let utType = UTType(fileType.rawValue) else { return nil }
    guard utType.conforms(to: .audio) || utType.conforms(to: .movie) else { return nil }
    guard let ext = utType.preferredFilenameExtension else { return nil }
    return (ext, utType.localizedDescription ?? "?", utType.conforms(to: .audio), utType.conforms(to: .movie))
}
for (ext, desc, isAudio, _) in audioOrVideo.sorted(by: { $0.0 < $1.0 }) {
    let kind = isAudio ? "audio" : "video"
    print("\(ext.padding(toLength: 6, withPad: " ", startingAt: 0)) \(kind.padding(toLength: 6, withPad: " ", startingAt: 0)) \(desc)")
}
EOF
swiftc -sdk $(xcrun --sdk macosx --show-sdk-path) /tmp/probe4.swift -o /tmp/probe4 && /tmp/probe4
3g2    video  3GPP2 movie
3gp    video  3GPP movie
aa     audio  Audible.com Audiobook
aac    audio  AAC audio
aax    audio  Audible.com Audiobook
ac3    audio  AC-3 audio
aifc   audio  AIFF-C audio
aiff   audio  AIFF audio
amr    audio  Adaptive Multi-rate audio
au     audio  AU audio
avi    video  AVI movie
caf    audio  Apple CoreAudio format
dv     video  DV movie
eac3   audio  Enhanced AC-3 audio
flac   audio  FLAC audio
loas   audio  Low Overhead MPEG-4 Audio Stream
m2p    video  MPEG-2 Stream
m2v    video  MPEG-2 video
m4a    audio  Apple MPEG-4 audio
m4b    audio  protected MPEG-4 audio
m4p    audio  protected MPEG-4 audio
m4r    audio  Ringtone
m4v    video  Apple MPEG-4 movie
mod    audio  MOD Audio File
mov    video  QuickTime movie
mp2    audio  MP2 audio
mp2    audio  MP2 audio
mp3    audio  MP3 audio
mp4    video  MPEG-4 movie
mp4    audio  MPEG-4 audio
mpg    video  MPEG movie
mts    video  AVCHD MPEG-2 Transport Stream
ogg    audio  Ogg Audio
qta    audio  QuickTime Audio
ts     video  MPEG-2 Transport Stream
vob    video  VOB File (DVD Video)
w64    audio  Wave64 Audio
wav    audio  Waveform audio
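
The listing above has 38 rows for 36 unique extensions: `mp2` and `mp4` each appear twice, presumably because more than one type identifier resolves to the same preferred extension. A minimal sketch of collapsing such rows, using a hypothetical `Row` type and a few sample rows copied from the output:

```swift
// Hypothetical row type mirroring the probe's (extension, kind, description) output.
struct Row {
    let ext: String
    let kind: String
    let desc: String
}

let rows = [
    Row(ext: "mp3", kind: "audio", desc: "MP3 audio"),
    Row(ext: "mp4", kind: "video", desc: "MPEG-4 movie"),
    Row(ext: "mp4", kind: "audio", desc: "MPEG-4 audio"),
    Row(ext: "ogg", kind: "audio", desc: "Ogg Audio"),
]

// Group by extension so each extension is listed once with all of its kinds.
let grouped = Dictionary(grouping: rows, by: { $0.ext })
for ext in grouped.keys.sorted() {
    let kinds = Set(grouped[ext, default: []].map { $0.kind }).sorted()
    print("\(ext): \(kinds.joined(separator: "/"))")
}

// A Set over the extensions gives the deduplicated count.
let uniqueExtensions = Set(rows.map { $0.ext })
print("unique extensions: \(uniqueExtensions.count)")
```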

@altic-dev (Owner) commented:

Thanks for the PR! I think this creates a conflict with @grohith327's change, which I already merged because it has other important fixes too. You guys figure it out and lmk how to proceed lmao. Keeping it open for you both. Thanks!

daaain added 2 commits April 1, 2026 15:40
Replace hardcoded format lists with a dynamic query to
AVURLAsset.audiovisualTypes(), which returns every audio/video format
the OS can actually decode. This means new codecs Apple adds in future
macOS releases are picked up automatically with zero code changes.

Closes altic-dev#213

Previously the supported formats were hardcoded in 4 separate places
(9 extensions). They are now centralised in MeetingTranscriptionService
and derived at launch from AVFoundation.

On macOS 26 this resolves to 52 formats:

  3g2, 3gp, aa, aac, aax, ac3, aifc, aiff, aivu, amr, au, avi, awb,
  caf, dv, eac3, flac, itt, loas, m1a, m2a, m2p, m2v, m3u, m4a, m4b,
  m4p, m4r, m4v, mod, mov, mp1, mp2, mp3, mp4, mpa, mpg, mqv, mts,
  ogg, pls, qta, sc2, scc, ts, ttml, vob, vtt, w64, wav, webvtt, xhe

Notably this includes OGG (via org.xiph.ogg-audio), which macOS 26
decodes natively through AVFoundation — no third-party dependency needed.

Changes:
- MeetingTranscriptionService: query AVURLAsset.audiovisualTypes() and
  convert to file extensions via UTType; centralise allowedContentTypes,
  user-facing description, and drop-error copy
- MeetingTranscriptionView: replace all 4 hardcoded format definitions
  with references to the service's centralised constants

# Conflicts:
#	Sources/Fluid/Services/MeetingTranscriptionService.swift
#	Sources/Fluid/UI/MeetingTranscriptionView.swift
@daaain force-pushed the support-more-formats branch from 25153e5 to 756d217 on April 1, 2026 14:37

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 756d21714e

let avTypes = AVURLAsset.audiovisualTypes()
let extensions = avTypes.compactMap { fileType -> String? in
guard let utType = UTType(fileType.rawValue) else { return nil }
guard utType.conforms(to: .audio) || utType.conforms(to: .movie) else { return nil }

P2: Restrict movie extensions to containers the pipeline can handle

Including every UTType that conforms to .movie in supportedFileExtensions admits many new video extensions (e.g. avi, mts, vob) that were previously rejected, but downstream logic still treats only mp4/mov as video containers (isVideoContainer = ["mp4", "mov"]). For providers with prefersNativeFileTranscription == true, those newly admitted movie files now take the native path instead of the buffered video path and can fail at runtime, turning a deterministic fileNotSupported rejection into a later transcription/conversion error.

Useful? React with 👍 / 👎.
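
The routing concern raised above can be sketched as follows. The `isVideoContainer` list and the `prefersNativeFileTranscription` flag are named in the review; the `Route` enum and the `route` function are hypothetical and only approximate the behaviour the comment describes, not the project's actual code.

```swift
// Hypothetical processing paths; only the names mentioned in the lead-in come from the review.
enum Route: Equatable {
    case nativeFile      // hand the file directly to the provider
    case bufferedVideo   // extract audio from a video container first
    case bufferedAudio   // decode audio into buffers
}

// Per the review, downstream logic treats only these as video containers.
let isVideoContainer: Set<String> = ["mp4", "mov"]

func route(ext: String, prefersNativeFileTranscription: Bool) -> Route {
    let ext = ext.lowercased()
    if isVideoContainer.contains(ext) { return .bufferedVideo }
    return prefersNativeFileTranscription ? .nativeFile : .bufferedAudio
}

// A newly admitted movie extension is not recognised as video,
// so with a native-file provider it takes the native path.
print(route(ext: "avi", prefersNativeFileTranscription: true))
print(route(ext: "mov", prefersNativeFileTranscription: true))
```

One possible remedy, not necessarily the one the PR adopted, would be to admit only those movie extensions that also appear in `isVideoContainer`.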

@daaain (Contributor, Author) commented Apr 1, 2026

OK, I rebased this onto the latest main and converted an OGG file I had to a few different formats with ffmpeg, and it seems to work beautifully:

[3 images attached]

So @altic-dev, if you want more general audio/video support and a bit of refactoring to centralise the format support text labels, this is still worth merging.

@altic-dev (Owner) commented:

Merging it :) Thank you!

@altic-dev altic-dev merged commit 51c6078 into altic-dev:main Apr 2, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

[✨ FEATURE] OGG support for local (Parakeet) transcription

2 participants