
Fix Ctrl+C cancellation race in baml-cli test#3083

Open
rossirpaulo wants to merge 1 commit into BoundaryML:canary from rossirpaulo:paulo/fix-ctrlc-notify-race

Conversation

@rossirpaulo
Collaborator

@rossirpaulo rossirpaulo commented Feb 6, 2026

Summary

  • fix a Ctrl+C cancellation race in baml-cli test by using Notify::notify_one() instead of notify_waiters()
  • add focused regression tests that document Notify behavior when the signal arrives before a waiter is registered
  • keep the runtime executor path unchanged

Issue

notify_waiters() only wakes tasks already waiting on notified() and does not persist a permit.

In Commands::Test, the SIGINT handler can run before the test executor reaches notified().await. When that happens, the cancellation signal is dropped: Ctrl+C is ignored and the test run hangs.

Why this fix

notify_one() stores a permit when there is no active waiter, so a later notified().await will complete immediately. That removes the race while preserving current cancellation flow.

Concrete evidence

  • buggy call site before fix: engine/cli/src/commands.rs used cancel_clone.notify_waiters() in the Ctrl+C handler
  • runtime awaits cancellation via notify.notified().await in engine/baml-runtime/src/test_executor/mod.rs
  • race exists when signal is emitted before waiter registration

Validation

  • RUSTUP_TOOLCHAIN=stable-aarch64-apple-darwin cargo test -p baml-cli notify_ -- --nocapture
  • RUSTUP_TOOLCHAIN=stable-aarch64-apple-darwin cargo test -p baml-runtime cancel_notify_returns_cancelled -- --nocapture

Summary by CodeRabbit

  • Tests

    • Added tests validating cancellation signal handling in CLI test execution.
  • Chores

    • Improved internal implementation of signal cancellation handling to ensure proper signal persistence and reliability during operation interruption.

@vercel

vercel bot commented Feb 6, 2026

@rossirpaulo is attempting to deploy a commit to the Boundary Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Contributor

coderabbitai bot commented Feb 6, 2026

📝 Walkthrough

The change replaces the Ctrl-C handler's use of notify_waiters() with a new emit_cancel_signal() helper that calls notify_one() instead. This ensures a persistent cancellation signal is available for future waiters. Tests validate the difference between these notify behaviors.

Changes

  • Ctrl-C Handler and Cancellation Signal — engine/cli/src/commands.rs: Replaces notify_waiters() with notify_one() via a new helper function to ensure a persistent cancellation signal for CLI execution; adds tests validating Notify behavior differences.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 40.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title 'Fix Ctrl+C cancellation race in baml-cli test' directly and accurately describes the main change: fixing a race condition in the Ctrl+C cancellation signal handling for the baml-cli test command.


@rossirpaulo rossirpaulo removed the codex label Feb 6, 2026
@rossirpaulo rossirpaulo requested a review from hellovai February 6, 2026 03:51
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Comment on lines +345 to +348
fn emit_cancel_signal(cancel_notify: &tokio::sync::Notify) {
    // Persist one permit in case SIGINT arrives before the executor starts awaiting.
    cancel_notify.notify_one();
}
Contributor

@coderabbitai coderabbitai bot Feb 6, 2026


🧹 Nitpick | 🔵 Trivial

Confirm the single-waiter assumption for this Notify.

notify_one() only wakes one waiter; if multiple tasks await the cancel token, others could miss cancellation. If multi-waiter is possible, consider broadcasting to current waiters and still persisting a permit for future ones.

Suggested adjustment to preserve broadcast + persistence
 fn emit_cancel_signal(cancel_notify: &tokio::sync::Notify) {
     // Persist one permit in case SIGINT arrives before the executor starts awaiting.
-    cancel_notify.notify_one();
+    cancel_notify.notify_waiters();
+    cancel_notify.notify_one();
 }

Collaborator Author


With the current architecture, this Notify has a single waiter, so a multi-waiter miss cannot happen.

Your concern is valid for future design though: if multiple tasks start awaiting the same cancel token, Notify becomes awkward for “wake all current waiters + stay cancelled for future waiters.” Naively combining notify_waiters() and notify_one() still has race windows. For that future case, a sticky broadcast cancellation primitive (for example CancellationToken-style semantics) is the safer model.

Contributor


@rossirpaulo, thank you for the clarification! You're absolutely right—the single-waiter design in the current architecture makes notify_one() the correct choice here, and my suggestion doesn't apply.

I appreciate you noting the future consideration as well. If the architecture ever evolves to support multiple concurrent waiters on the same cancellation signal, a dedicated primitive like tokio_util::sync::CancellationToken would indeed be the safer path forward, as it provides the "sticky broadcast" semantics you described without the race windows.

Comment on lines +356 to +382
#[cfg(test)]
mod tests {
    use std::time::Duration;

    #[tokio::test]
    async fn notify_waiters_drops_signal_if_emitted_before_waiting() {
        let notify = tokio::sync::Notify::new();
        notify.notify_waiters();

        let result = tokio::time::timeout(Duration::from_millis(20), notify.notified()).await;
        assert!(
            result.is_err(),
            "notify_waiters should not persist signals for future waiters",
        );
    }

    #[tokio::test]
    async fn notify_one_persists_signal_if_emitted_before_waiting() {
        let notify = tokio::sync::Notify::new();
        super::emit_cancel_signal(&notify);

        let result = tokio::time::timeout(Duration::from_millis(20), notify.notified()).await;
        assert!(
            result.is_ok(),
            "notify_one should persist one signal for a future waiter",
        );
    }
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, verify the file and code at the specified location
cat -n engine/cli/src/commands.rs | sed -n '356,382p'

🏁 Script executed:

# Check for other timeout patterns in the codebase to understand conventions
rg "Duration::from_millis" --type rust -A 2 -B 2

🏁 Script executed:

# Also check for tokio test patterns
rg "tokio::time::timeout" --type rust -A 2 -B 2



Increase timeout to reduce CI flakiness in unit tests.

These 20ms timeouts are prone to flakiness under CI load. Increase to 200ms (or higher) to provide a safety margin for OS scheduling overhead without affecting test correctness.

Example: increase timeout
-        let result = tokio::time::timeout(Duration::from_millis(20), notify.notified()).await;
+        let result = tokio::time::timeout(Duration::from_millis(200), notify.notified()).await;
