Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@ build-conan/
module.tar.gz
audio-module
settings.json
CLAUDE.md
19 changes: 19 additions & 0 deletions README.md
@@ -86,6 +86,25 @@ The following attributes are available for the `viam:audio:speaker` model:
| `latency` | int | **Optional** | Suggested output latency in milliseconds. This controls how much audio PortAudio buffers before making it available. Lower values (5-20ms) provide faster audio output but use more CPU time. Higher values (50-100ms) are more stable but less responsive. If not specified, uses the device's default low latency setting (typically 10-20ms). |
| `volume` | int | **Optional** | Output volume as percentage (0-100). Supported on Linux devices only. On macOS, use the system volume controls (keyboard keys). |

#### DoCommand

The speaker supports the following DoCommands:

**`set_volume`** — Set the speaker output volume.
Review comment from the PR author (Collaborator): Forgot to document this before, so adding it now.


```json
{"set_volume": 75}
```
- Value must be between 0 and 100.
- **Linux only.** On macOS, use the system volume controls (keyboard keys).
- Returns: `{"volume": 75}`

**`stop`** — Immediately stop audio playback.
```json
{"stop": true}
```
- Interrupts any in-progress `Play` call and silences the output.
- Returns: `{"stopped": true}`
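As an illustration of the request and response shapes documented above, here is a minimal, self-contained sketch of a command dispatcher. `Value`, `Struct`, and `handle_command` are hypothetical stand-ins for this example only; they are not the Viam SDK's `ProtoStruct` or the module's real `do_command`. The sketch assumes C++17 and assumes that out-of-range volumes and unknown keys are rejected with `std::invalid_argument`, which matches what the tests in this PR expect.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical stand-ins for the SDK's ProtoStruct (assumption for this sketch).
using Value = std::variant<bool, double>;
using Struct = std::map<std::string, Value>;

// Dispatch on the command key and return the documented response shape.
Struct handle_command(const Struct& command) {
    if (auto it = command.find("set_volume"); it != command.end()) {
        // The volume must be numeric and within 0-100 (pass 75.0, not 75).
        const double* vol = std::get_if<double>(&it->second);
        if (vol == nullptr || *vol < 0.0 || *vol > 100.0) {
            throw std::invalid_argument("set_volume must be a number between 0 and 100");
        }
        return Struct{{"volume", *vol}};
    }
    if (command.count("stop")) {
        return Struct{{"stopped", true}};
    }
    throw std::invalid_argument("unknown command");
}
```

Calling `handle_command(Struct{{"set_volume", 75.0}})` yields a struct whose `volume` entry holds `75.0`, and a `stop` command yields `stopped` set to `true`. A real handler would also apply the volume and interrupt playback; that part is omitted here.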

## Model viam:audio:discovery

This model is used to discover audio devices on your machine.
16 changes: 16 additions & 0 deletions src/speaker.cpp
@@ -172,6 +172,17 @@ viam::sdk::ProtoStruct Speaker::do_command(const viam::sdk::ProtoStruct& command
        return viam::sdk::ProtoStruct{{"volume", static_cast<double>(vol)}};
    }

    if (command.count("stop")) {
        VIAM_SDK_LOG(info) << "Stop command received, interrupting playback";
        stop_requested_.store(true);
        // Advance playback position to write position so no more audio is played.
        std::lock_guard<std::mutex> lock(stream_mu_);
        if (audio_context_) {
            audio_context_->playback_position.store(audio_context_->get_write_position(), std::memory_order_relaxed);

Review comment from @dgottlieb (Feb 23, 2026):

I don't think using memory_order_relaxed is correct here, unless we truly don't care what the value is. If the reader of playback_position also happened to acquire the stream_mu_, this could be accidentally right (using a mutex plus atomic variables is awkward; I don't believe we're intentionally trying to rely on some memory ordering relationship there).

In general, one shouldn't need to play with the default memory_order_seq_cst that these stores and loads use.

Reply from the PR author (Collaborator): Oops, sorry, I merged before I saw this. I'll make a ticket to remove the memory_order usages and verify there's no performance regression in the audio callback.

        }
        return viam::sdk::ProtoStruct{{"stopped", true}};
    }

    throw std::invalid_argument("unknown command");
}
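
The review thread above questions `memory_order_relaxed`. As a minimal sketch of what the default ordering provides, the example below publishes a plain value through an atomic flag using the default `memory_order_seq_cst` store and load. The names `publish_and_read`, `payload`, and `ready` are hypothetical and unrelated to the module's members. Whether the speaker code actually depends on this kind of ordering cannot be determined from the diff alone.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration; not the module's real code.
int publish_and_read() {
    int payload = 0;                 // plain (non-atomic) data
    std::atomic<bool> ready{false};  // flag that publishes the payload

    std::thread writer([&] {
        payload = 42;
        // Default store is memory_order_seq_cst, which includes release
        // semantics: a reader that observes ready == true also sees the
        // payload write that came before it.
        ready.store(true);
    });

    // Default load is memory_order_seq_cst, which includes acquire semantics.
    while (!ready.load()) {
        std::this_thread::yield();
    }
    int seen = payload;  // well-defined: ordered after the writer's payload write

    writer.join();
    return seen;
}
```

If both the store and the load used `std::memory_order_relaxed`, the flag itself would still be updated atomically, but the read of `payload` would no longer be ordered after the write and would be a data race. This is why the defaults are the safer starting point unless profiling shows a need to relax them.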

@@ -189,6 +200,7 @@ void Speaker::play(std::vector<uint8_t> const& audio_data,
                   boost::optional<viam::sdk::audio_info> info,
                   const viam::sdk::ProtoStruct& extra) {
    std::lock_guard<std::mutex> playback_lock(playback_mu_);
    stop_requested_.store(false);

    VIAM_SDK_LOG(debug) << "Play called, adding samples to playback buffer";

@@ -307,6 +319,10 @@ void Speaker::play(std::vector<uint8_t> const& audio_data,
    // Block until playback position catches up
    VIAM_SDK_LOG(debug) << "Waiting for playback to complete...";
    while (playback_context->playback_position.load() - start_position < final_num_samples) {
        if (stop_requested_.load()) {
            VIAM_SDK_LOG(debug) << "Playback stopped by stop command";
            return;
        }
        // Check if context changed (reconfigure happened)
        {
            std::lock_guard<std::mutex> lock(stream_mu_);
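
The hunk above lets a blocked `play` call return early once `stop` has been requested. A minimal sketch of that pattern, assuming a polling wait loop: `Player`, `wait_for_playback`, and `demo_interrupted` are hypothetical names, and the 1 ms sleep between polls is an assumption, since the diff does not show how the real loop paces itself.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the speaker's playback state.
struct Player {
    std::atomic<std::uint64_t> playback_position{0};
    std::atomic<bool> stop_requested{false};

    // Blocks until num_samples have played past `start` or stop() is called.
    // Returns true if playback ran to completion, false if it was interrupted.
    bool wait_for_playback(std::uint64_t start, std::uint64_t num_samples) {
        while (playback_position.load() - start < num_samples) {
            if (stop_requested.load()) {
                return false;
            }
            // Assumed pacing between polls; not taken from the diff.
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        return true;
    }

    void stop() { stop_requested.store(true); }
};

// Nothing advances playback_position here, so the wait can only end via stop().
bool demo_interrupted() {
    Player player;
    std::thread stopper([&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        player.stop();
    });
    bool completed = player.wait_for_playback(0, 1000);
    stopper.join();
    return !completed;
}
```

In the PR, `play` resets the flag at the start of each call, so a stop that arrives between calls does not cancel the next one. A caller of this sketch would need to do the same before reusing a `Player`.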
3 changes: 3 additions & 0 deletions src/speaker.hpp
@@ -75,6 +75,9 @@ class Speaker final : public viam::sdk::AudioOut, public viam::sdk::Reconfigurab

    // Audio context for speaker playback (includes buffer and playback position tracking)
    std::shared_ptr<audio::OutputStreamContext> audio_context_;

    // Flag to interrupt playback
    std::atomic<bool> stop_requested_{false};
};

} // namespace speaker
19 changes: 19 additions & 0 deletions test/speaker_test.cpp
@@ -203,6 +203,25 @@ TEST_F(SpeakerTest, DoCommandSetVolumeOutOfRange) {
    EXPECT_THROW(speaker.do_command(command_low), std::invalid_argument);
}


TEST_F(SpeakerTest, DoCommandStop) {
    auto attributes = ProtoStruct{};
    ResourceConfig config(
        "rdk:component:speaker", "", test_name_, attributes, "",
        speaker::Speaker::model, LinkConfig{}, log_level::info);

    Dependencies deps{};
    speaker::Speaker speaker(deps, config, mock_pa_.get());

    ProtoStruct command{{"stop", true}};
    auto result = speaker.do_command(command);


    ASSERT_TRUE(result.count("stopped"));
    EXPECT_EQ(speaker.stop_requested_, true);
}


TEST_F(SpeakerTest, DoCommandUnknown) {
    auto attributes = ProtoStruct{};
    ResourceConfig config(