feat: support custom audio track input #29
base: main
Pull request overview
This PR adds support for custom audio track input to the Anam SDK, enabling developers to provide their own audio source for speech-to-text processing instead of relying on the default microphone input.
Changes:
- Added `audio_input_track` parameter to `ClientOptions` for configuring custom audio input
- Updated the WebRTC connection setup to use the custom audio track when provided
- Exported `AudioStreamTrack` and `AudioFrame` classes from the main package for easier access
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/anam/types.py | Added audio_input_track field to ClientOptions with documentation |
| src/anam/client.py | Passed audio_input_track option to the streaming client |
| src/anam/_streaming.py | Implemented custom audio track handling in WebRTC setup and cleanup |
| src/anam/__init__.py | Exported AudioStreamTrack and AudioFrame for public API access |
No issues found across 4 files
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… buffer on recv, no flush on connect
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="src/anam/_streaming.py">
<violation number="1" location="src/anam/_streaming.py:720">
P2: The upper-bound check uses 4800 instead of 48000, causing the warning to fire for all valid sample rates above 4.8 kHz. This makes the warning misleading and noisy for normal inputs.</violation>
</file>
)
+ if (
+ sample_rate < 16000
+ or sample_rate > 4800
P2: The upper-bound check uses 4800 instead of 48000. Because every sample rate is either below 16000 or above 4800, the warning fires for all inputs, including the valid rates. This makes the warning misleading and noisy for normal inputs.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/anam/_streaming.py, line 720:
<comment>The upper-bound check uses 4800 instead of 48000, causing the warning to fire for all valid sample rates above 4.8 kHz. This makes the warning misleading and noisy for normal inputs.</comment>
<file context>
@@ -706,13 +706,24 @@ def send_user_audio(
)
+ if (
+ sample_rate < 16000
+ or sample_rate > 4800
+ or sample_rate not in [16000, 24000, 32000, 44100, 48000]
+ ):
</file context>
Suggested change:
- or sample_rate > 4800
+ or sample_rate > 48000
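To make the effect of the typo concrete, here is a minimal standalone sketch of the condition from the file context, before and after the suggested change. The function names and the `SUPPORTED_SAMPLE_RATES` constant are hypothetical and exist only for this illustration; the real check lives inline in `send_user_audio` and may differ in detail.

```python
# Hypothetical constant: the list of rates taken from the diff's file context.
SUPPORTED_SAMPLE_RATES = (16000, 24000, 32000, 44100, 48000)


def buggy_should_warn(sample_rate: int) -> bool:
    """Condition as written in the diff (upper bound typed as 4800)."""
    return (
        sample_rate < 16000
        or sample_rate > 4800
        or sample_rate not in SUPPORTED_SAMPLE_RATES
    )


def fixed_should_warn(sample_rate: int) -> bool:
    """Condition with the suggested upper bound of 48000."""
    return (
        sample_rate < 16000
        or sample_rate > 48000
        or sample_rate not in SUPPORTED_SAMPLE_RATES
    )


def membership_only_should_warn(sample_rate: int) -> bool:
    """Equivalent simplification: the membership test already implies both bounds."""
    return sample_rate not in SUPPORTED_SAMPLE_RATES


if __name__ == "__main__":
    for rate in (8000, 16000, 44100, 48000, 96000):
        print(rate, buggy_should_warn(rate), fixed_should_warn(rate))
```

Since every listed rate lies between 16000 and 48000, the two range comparisons are redundant once the bound is corrected. Keeping them is harmless, but the membership test alone would express the same rule.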
Summary by cubic
Adds support for sending user audio via a new send_user_audio(...) API that takes raw 16-bit PCM. Default behavior is unchanged when input is disabled or when no audio is sent.
Written for commit 54d09c4. Summary will update on new commits.
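For readers unfamiliar with the format, the following sketch shows one way to produce raw 16-bit PCM bytes using only the Python standard library. It assumes mono, little-endian, signed samples; the summary above only says "raw 16-bit PCM", so the channel layout and byte order are assumptions, and `sine_pcm16` is a hypothetical helper, not part of the SDK. How the bytes are handed to `send_user_audio(...)` is not shown because its full signature does not appear in this thread.

```python
import math
import struct


def sine_pcm16(
    frequency_hz: float,
    duration_s: float,
    sample_rate: int = 16000,
    amplitude: float = 0.5,
) -> bytes:
    """Return a sine tone as signed 16-bit little-endian mono PCM (assumed layout)."""
    n_samples = round(duration_s * sample_rate)
    peak = int(amplitude * 32767)  # stay within the int16 range
    samples = [
        int(peak * math.sin(2 * math.pi * frequency_hz * i / sample_rate))
        for i in range(n_samples)
    ]
    # "<" = little-endian, "h" = signed 16-bit integer, repeated n_samples times
    return struct.pack(f"<{n_samples}h", *samples)


# A 20 ms chunk at 16 kHz is 320 samples, i.e. 640 bytes.
pcm_chunk = sine_pcm16(440.0, 0.02)
```

Each sample occupies two bytes, so the byte length of a chunk is always twice the sample count for mono audio.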